Abstract

There has been an increasing interest in testing the equality of large Pearson’s correlation matrices. However, in many applications it is more important to test the equality of large rank-based correlation matrices since they are more robust to outliers and nonlinearity. Unlike the Pearson’s case, testing the equality of large rank-based statistics has not been well explored and requires us to develop new methods and theory. In this paper, we provide a framework for testing the equality of two large U-statistic based correlation matrices, which include the rank-based correlation matrices as special cases. Our approach exploits extreme value statistics and the Jackknife estimator for uncertainty assessment and is valid under a fully nonparametric model. Theoretically, we develop a theory for testing the equality of U-statistic based correlation matrices. We then apply this theory to study the problem of testing large Kendall’s tau correlation matrices and demonstrate its optimality. For proving this optimality, a novel construction of least favourable distributions is developed for the correlation matrix comparison.

Keywords: Extreme value type I distribution; U-statistics; Hypothesis testing; Kendall’s tau; Jackknife variance estimator.

1 Introduction 

Let $\mathbf{X} = (X_1, \ldots, X_d)^T$ and $\mathbf{Y} = (Y_1, \ldots, Y_d)^T$ be two $d$-dimensional random vectors. We denote by $\mathbf{X}_1, \ldots, \mathbf{X}_{n_1}$ the $n_1$ independent samples of $\mathbf{X}$ and by $\mathbf{Y}_1, \ldots, \mathbf{Y}_{n_2}$ the $n_2$ independent samples of $\mathbf{Y}$. Letting $n := \max\{n_1, n_2\}$, we aim to test the equality of the U-statistic based correlation matrices (e.g., Kendall’s tau or Spearman’s rho) of $\mathbf{X}$ and $\mathbf{Y}$. We consider the high dimensional regime in which $d, n \to \infty$ and $d/n$ does not necessarily go to zero as $n \to \infty$. This problem has important applications, including portfolio selection (Markowitz, 1991), high dimensional discriminant analysis (Han et al., 2013; Mai and Zou, 2013) and gene selection (Ho et al., 2008; Hu et al., 2009, 2010).

When $d/n \to 0$, Anderson (2003) and Muirhead (2009) study the problem of testing the equality of two Pearson’s correlation matrices. Major test criteria include the likelihood ratio (Anderson, 2003), spectral norm of difference (Roy, 1957) and Frobenius norm of difference (Nagao, 1973). When $d/n \not\to 0$, the likelihood ratio test and the tests in Roy (1957) and Nagao (1973) perform poorly, as the Pearson’s sample correlation matrices no longer converge to their population counterparts under the spectral norm (Bai and Yin, 1993). A line of research aims to correct the aforementioned tests or to propose new methods. For the likelihood ratio test, Bai et al. (2009) introduce a corrected likelihood ratio test which works when $d/n \to c \in (0,1)$, and Jiang et al. (2012) generalize it to the case when $d < n$ and $c = 1$. As a generalization of Nagao’s proposal, Schott (2007) and Li and Chen (2012) propose new test statistics based on an unbiased estimator of the Frobenius norm of the matrix difference, and Srivastava and Yanagihara (2010) propose another test statistic based on the difference of two Frobenius norms. Recently, Cai et al. (2013) propose a method based on the sup-norm of the matrix difference and prove its rate optimality under a sparse alternative.

In many applications, it is more meaningful to test the equality of two rank-based correlation matrices instead of the Pearson’s correlation matrices. In particular, Embrechts et al. (2003) point out that the Pearson’s correlation coefficient “might prove very misleading” in measuring the dependence and advocate the usage of rank correlation coefficients, such as Kendall’s tau (Kendall, 1938) or Spearman’s rho (Spearman, 1904). Though testing the equality of high dimensional rank-based correlation matrices is of fundamental importance, there has been very little work in this area. To bridge this gap, this paper proposes a unified framework for testing the equality of two large U-statistic based correlation matrices $\mathbf{U}_1$ and $\mathbf{U}_2$, which includes the rank-based correlation matrices as special examples. More specifically, let $\hat{\mathbf{U}}_1 = (\hat{u}_{1,ij})$ be a type of correlation matrix of $\mathbf{X}$ whose elements are all U-statistics$^1$. Similarly to $\hat{\mathbf{U}}_1$, we define $\hat{\mathbf{U}}_2 = (\hat{u}_{2,ij})$ to be the same kind of U-statistic based correlation matrix of $\mathbf{Y}$. Writing $\mathbf{U}_1$ and $\mathbf{U}_2$ for their population counterparts (the expectations of $\hat{\mathbf{U}}_1$ and $\hat{\mathbf{U}}_2$; see Section 2), we aim to test the hypothesis

$$H_0: \mathbf{U}_1 = \mathbf{U}_2 \quad \text{v.s.} \quad H_1: \mathbf{U}_1 \neq \mathbf{U}_2. \tag{1.1}$$

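Testing (1.1) plays an important role in many fields. For example, testing the equality of two Kendall’s tau correlation matrices $\mathbf{U}_1^\tau$ and $\mathbf{U}_2^\tau$,

$$H_0^\tau: \mathbf{U}_1^\tau = \mathbf{U}_2^\tau \quad \text{v.s.} \quad H_1^\tau: \mathbf{U}_1^\tau \neq \mathbf{U}_2^\tau, \tag{1.2}$$

can be used to test the model of copula discriminant analysis (Han et al., 2013; Mai and Zou, 2013).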

There are four major contributions of this paper. First, for the first time in the literature, we develop a unified framework for testing the equality of two large U-statistic based correlation matrices. This framework builds upon a fully nonparametric model and enables us to conduct homogeneity tests using a wide range of correlation measures. Second, as a special example, we examine the problem of testing the equality of two large Kendall’s tau matrices and prove the minimax optimality of the proposed method. Third, we further propose alternative approaches for testing $\mathbf{U}_1^\tau = \mathbf{U}_2^\tau$, which attain better empirical performance than the Jackknife based one.

$^1$ Such U-statistic based correlation measures are quite general. For example, $\hat{u}_{1,ij}$ can represent the Kendall’s tau correlation coefficient between $X_i$ and $X_j$.

Finally, to develop a theory of testing the equality of general U-statistic based correlation matrices,
we develop an upper bound of Jackknife variance estimation error, which enables us to obtain 
the explicit rate of convergence. For Kendall’s tau matrices, we prove an upper bound of the 
traditional plug-in variance estimation error and an upper bound of the variance difference between 
two Kendall’s tau correlation coefficients. These upper bounds allow us to exploit extreme value 
theory under the dependent setting to prove theorems in this paper. Their constructions are 
nontrivial and are of independent technical interest. To prove the optimality of the proposed testing 
methods for Kendall’s tau matrices, we construct a collection of least favorable distributions with 
regard to the test hypothesis. This construction technique is novel and tailored for testing the 
equality of correlation matrices. In contrast, the construction in Cai et al. (2013) only perturbs the 
diagonal elements of the covariance, which does not affect the resulting correlation matrices. 

1.1 More Related Works 

Apart from the Pearson’s correlation coefficient and the general U-statistic based correlation measures studied in this paper, the existing literature also considers other measures of dependence. These include the distance correlation (Szekely et al., 2007) and the randomized dependence coefficient (Lopez-Paz et al., 2013). To the best of our knowledge, no existing work discusses testing the equality of dependence structures with regard to these dependence measures.

Our work is closely related to the random matrix theory on rank correlation matrices. Bai and Zhou (2008), Zhou (2007) and Bao et al. (2013) study the theoretical properties of the large Spearman’s rho correlation matrix $\hat{\mathbf{U}}^\rho$. Specifically, Bai and Zhou (2008) prove the Marchenko-Pastur law for the limiting spectral distribution of $\hat{\mathbf{U}}^\rho$, Zhou (2007) proves the extreme value type I distribution for $\max_{k<j} |\hat{u}^\rho_{kj}|$, and Bao et al. (2013) derive the limiting distributions of traces of all higher moments of $\hat{\mathbf{U}}^\rho$. Most of these results hold only under the independence setting, i.e., the entries of $\mathbf{X}$ are independent of each other. In contrast, our work focuses on the dependent setting.

Our work is also related to robust testing, where the test statistics are robust estimators of the Pearson’s covariance/correlation coefficients. These include S-estimators and some robust dispersion estimators. We refer to O’Brien (1992), Aslam and Rocke (2005) and the references therein for details. Our work is also related to the adaptive estimation of a large correlation/covariance matrix (Cai and Liu, 2011) or a large Gaussian (copula) graphical model. See, for example, Bickel and Levina (2008), Zhao et al. (2014), Ravikumar et al. (2011) and Liu et al. (2012).

1.2 Notation 

We denote by $\|\mathbf{v}\|_2 = (\sum_{j=1}^d v_j^2)^{1/2}$ the Euclidean norm of a vector $\mathbf{v} = (v_1, \ldots, v_d)^T \in \mathbb{R}^d$. For a matrix $\mathbf{A} = (a_{ij}) \in \mathbb{R}^{d \times q}$, we define its spectral norm $\|\mathbf{A}\|_2 := \sup_{\|\mathbf{x}\|_2 \le 1} \|\mathbf{A}\mathbf{x}\|_2$ and Frobenius norm $\|\mathbf{A}\|_F := (\sum_{i,j} a_{ij}^2)^{1/2}$. We define the matrix entrywise sup-norm as $\|\mathbf{A}\|_{\max} := \max_{i,j} |a_{ij}|$.

We use $\mathrm{Rank}(\mathbf{A})$ to denote the rank of $\mathbf{A}$. If $\mathbf{A}$ is a square matrix, we define $\mathrm{Diag}(\mathbf{A})$ to be a diagonal matrix with the same main diagonal as $\mathbf{A}$. We use $\mathbf{I}_d$ to denote an identity matrix of size $d$. For two sequences of real numbers $\{a_n\}$ and $\{b_n\}$, we write $a_n = O(b_n)$ if there exists a constant $C$ such that $|a_n| \le C|b_n|$ holds for all sufficiently large $n$, write $a_n = o(b_n)$ if $a_n/b_n \to 0$, and write $a_n \asymp b_n$ if there exist constants $C \ge c > 0$ such that $c|b_n| \le |a_n| \le C|b_n|$ for all sufficiently large $n$. For a square matrix $\mathbf{S} \in \mathbb{R}^{d \times d}$, we use $\lambda_{\min}(\mathbf{S})$ and $\lambda_{\max}(\mathbf{S})$ to denote the minimal and maximal eigenvalues of $\mathbf{S}$. For a set $B$, we use $|B|$ to denote its cardinality.

1.3 Paper Organization 

The rest of this paper is organized as follows. Section 2 formalizes the problem, describes a general 
testing procedure and analyzes the theoretical properties (e.g., size and power) of the proposed test. 
In Section 3 we focus on testing large Kendall’s tau matrices, for which we consider two models: a 
fully nonparametric model and a semiparametric Gaussian copula model. Under certain modeling 
assumptions, we propose additional tests for the Kendall’s tau matrices which have better empirical 
performance compared to the general testing procedure. Section 4 provides thorough numerical 
results on both simulated and real equity data. In Section 5 we discuss potential future work. All 
proofs are collected in the appendix and supplementary materials. 

2 A General Procedure for Testing U-Statistic Based Matrices 

This section presents a generic testing method for U-statistic based matrix comparison. In Section 
2.1 we describe the proposed testing procedure. In Section 2.2 we analyze its asymptotic size and 
power. In Section 2.3 we consider comparing a row or column of U-statistic based matrices. 

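Let $\mathbf{X} = (X_1, \ldots, X_d)^T$ and $\mathbf{Y} = (Y_1, \ldots, Y_d)^T$ be two $d$-dimensional random vectors independent of each other. We denote $\mathbf{X}_1, \ldots, \mathbf{X}_{n_1}$ to be random samples from $\mathbf{X}$ with $\mathbf{X}_i = (X_{i1}, X_{i2}, \ldots, X_{id})^T$. Similarly, $\mathbf{Y}_1, \ldots, \mathbf{Y}_{n_2}$ are random samples from $\mathbf{Y}$ with $\mathbf{Y}_i = (Y_{i1}, Y_{i2}, \ldots, Y_{id})^T$. For $i, j = 1, \ldots, q$, let $\Phi_{ij}$ be an $m$-th order kernel function defined as

$$\Phi_{ij}: \underbrace{\mathbb{R}^d \times \cdots \times \mathbb{R}^d}_{m} \to \mathbb{R} \quad \text{with the symmetric property}^2: \ \Phi_{ij} = \Phi_{ji}. \tag{2.1}$$

Thus, we have a family of functions $\{\Phi_{ij}, 1 \le i, j \le q\}$. Furthermore, each $\Phi_{ij}$ is a symmetric Borel measurable function with kernel order $m$ fixed$^3$. We assume that $\Phi_{ij}$ is uniformly bounded. Many useful U-statistics satisfy these conditions. We set

$$\hat{u}_{1,ij} := \binom{n_1}{m}^{-1} \sum_{1 \le \ell_1 < \cdots < \ell_m \le n_1} \Phi_{ij}(\mathbf{X}_{\ell_1}, \ldots, \mathbf{X}_{\ell_m}), \qquad \hat{u}_{2,ij} := \binom{n_2}{m}^{-1} \sum_{1 \le \ell_1 < \cdots < \ell_m \le n_2} \Phi_{ij}(\mathbf{Y}_{\ell_1}, \ldots, \mathbf{Y}_{\ell_m}).$$

We then define the following U-statistic based matrices $\hat{\mathbf{U}}_a \in \mathbb{R}^{q \times q}$ for $a = 1, 2$:

$$\hat{\mathbf{U}}_1 := (\hat{u}_{1,ij})_{1 \le i,j \le q} \quad \text{and} \quad \hat{\mathbf{U}}_2 := (\hat{u}_{2,ij})_{1 \le i,j \le q}. \tag{2.2}$$

Correspondingly, we use $\mathbf{U}_a := (u_{a,ij})_{1 \le i,j \le q}$ to denote the expectation of $\hat{\mathbf{U}}_a$, i.e., $u_{a,ij} = \mathbb{E}[\hat{u}_{a,ij}]$. We can view $\hat{\mathbf{U}}_1$ and $\hat{\mathbf{U}}_2$ as a type of correlation matrices of $\mathbf{X}$ and $\mathbf{Y}$. We are interested in testing the equality of $\mathbf{U}_1$ and $\mathbf{U}_2$, which includes testing the equality of two large Kendall’s tau or Spearman’s rho correlation matrices.

$^2$ The symmetry requirement can be easily relaxed. Here we pose this constraint merely for clarity of presentation.

$^3$ We assume each $\Phi_{ij}$ has the same fixed kernel order $m$ for clarity of presentation. It is straightforward to extend to the setting in which the $m$’s are uniformly bounded.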

We note that $q$ is the row and column number of $\hat{\mathbf{U}}_a$ and $\mathbf{U}_a$, while $d$ is the dimension of $\mathbf{X}$ and $\mathbf{Y}$. The values of $q$ and $d$ can be different. Therefore, the framework considered in this paper is quite general. For example, it allows $\hat{\mathbf{U}}_a$ to represent the dependence structure of dimension-reduced data, where the dimension reduction step is incorporated in the kernel functions $\{\Phi_{ij}, 1 \le i, j \le q\}$.


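Remark 2.1. We can relax $\Phi_{ij}$ to be an asymmetric kernel function without loss of generality. Specifically, an asymmetric kernel $\Phi(\cdot)$ gives a U-statistic

$$u = \left[ m! \binom{n}{m} \right]^{-1} \sum \Phi(\mathbf{X}_{\ell_1}, \ldots, \mathbf{X}_{\ell_m}),$$

where the summation is taken over all ordered selections of $m$ distinct elements $(\ell_1, \ldots, \ell_m)$ from $\{1, \ldots, n\}$. Using Hoeffding’s method (Hoeffding, 1948), $u$ is also a U-statistic of the symmetric kernel

$$\Phi^0(\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_m) = \frac{1}{m!} \sum \Phi(\mathbf{x}_{\alpha_1}, \ldots, \mathbf{x}_{\alpha_m}),$$

where the summation is taken over all permutations $(\alpha_1, \ldots, \alpha_m)$ of $\{1, \ldots, m\}$.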
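The equivalence stated in Remark 2.1 can be checked numerically. The following is a minimal Python sketch and not part of the proposed procedure: the helper names `symmetrize`, `u_stat_ordered` and `u_stat_symmetric`, the toy kernel and the toy data are all assumptions introduced for illustration.

```python
from itertools import combinations, permutations
from math import comb, factorial


def symmetrize(kernel, m):
    """Average a kernel over all m! orderings of its arguments (Hoeffding's method)."""
    def symmetric_kernel(*args):
        total = sum(kernel(*(args[p] for p in perm))
                    for perm in permutations(range(m)))
        return total / factorial(m)
    return symmetric_kernel


def u_stat_ordered(kernel, data, m):
    """U-statistic of a possibly asymmetric kernel: sum over ordered m-tuples
    of distinct indices, normalised by m! * C(n, m)."""
    n = len(data)
    total = sum(kernel(*(data[i] for i in idx))
                for idx in permutations(range(n), m))
    return total / (factorial(m) * comb(n, m))


def u_stat_symmetric(kernel, data, m):
    """U-statistic of a symmetric kernel: sum over index sets l_1 < ... < l_m,
    normalised by C(n, m)."""
    n = len(data)
    total = sum(kernel(*(data[i] for i in idx))
                for idx in combinations(range(n), m))
    return total / comb(n, m)


def toy_kernel(a, b):
    """Hypothetical asymmetric kernel of order m = 2."""
    return a * b * b


toy_data = [1.0, 2.0, 3.0, 5.0]
u_ordered = u_stat_ordered(toy_kernel, toy_data, 2)
u_symmetric = u_stat_symmetric(symmetrize(toy_kernel, 2), toy_data, 2)
```

Both routes give the same value up to floating-point error, which is why restricting attention to symmetric kernels loses no generality.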

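The Kendall’s tau matrix is an example of the U-statistic based matrix defined in (2.2). More specifically, we set

$$\Phi_{ij}(\mathbf{X}_k, \mathbf{X}_\ell) = \mathrm{sign}(X_{ki} - X_{\ell i})\,\mathrm{sign}(X_{kj} - X_{\ell j}), \qquad \Phi_{ij}(\mathbf{Y}_k, \mathbf{Y}_\ell) = \mathrm{sign}(Y_{ki} - Y_{\ell i})\,\mathrm{sign}(Y_{kj} - Y_{\ell j}),$$

and $q = d$. The Kendall’s tau sample correlation coefficients $\hat{\tau}_{1,ij}$ and $\hat{\tau}_{2,ij}$ are then defined as

$$\hat{\tau}_{1,ij} := \frac{2}{n_1(n_1 - 1)} \sum_{1 \le k < \ell \le n_1} \mathrm{sign}(X_{ki} - X_{\ell i})\,\mathrm{sign}(X_{kj} - X_{\ell j}),$$

$$\hat{\tau}_{2,ij} := \frac{2}{n_2(n_2 - 1)} \sum_{1 \le k < \ell \le n_2} \mathrm{sign}(Y_{ki} - Y_{\ell i})\,\mathrm{sign}(Y_{kj} - Y_{\ell j}).$$

Their population counterparts are $\tau_{a,ij} := \mathbb{E}[\hat{\tau}_{a,ij}]$ for $a = 1, 2$. We then write the empirical and population Kendall’s tau matrix as

$$\hat{\mathbf{U}}_a^\tau = (\hat{\tau}_{a,ij}) \quad \text{and} \quad \mathbf{U}_a^\tau = (\tau_{a,ij}), \tag{2.3}$$

where $a = 1, 2$. In Section 3, we aim to test the equality of two large Kendall’s tau matrices.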
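As an illustration of the definition above, the empirical Kendall’s tau matrix can be computed directly from the sign kernel. This is a minimal pure-Python sketch with quadratic cost in the sample size; the function names `sign` and `kendall_tau_matrix` and the toy data are assumptions made for this example.

```python
from itertools import combinations


def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def kendall_tau_matrix(samples):
    """Empirical Kendall's tau matrix of n samples, each a sequence of length d.

    Entry (i, j) averages sign(X_ki - X_li) * sign(X_kj - X_lj)
    over all sample pairs k < l.
    """
    n = len(samples)
    d = len(samples[0])
    pairs = list(combinations(range(n), 2))
    tau = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i, d):
            total = sum(
                sign(samples[k][i] - samples[l][i]) * sign(samples[k][j] - samples[l][j])
                for k, l in pairs
            )
            # The kernel is symmetric in (i, j), so fill both triangles.
            tau[i][j] = tau[j][i] = 2.0 * total / (n * (n - 1))
    return tau


# Toy data (hypothetical): coordinates 0 and 1 move together, coordinate 2 is reversed.
toy_samples = [(1.0, 1.0, 3.0), (2.0, 2.0, 2.0), (3.0, 3.0, 1.0)]
toy_tau = kendall_tau_matrix(toy_samples)
```

With continuous data there are no ties, so every diagonal entry equals one, which is why only the off-diagonal entries matter when two such matrices are compared.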


2.1 A General Testing Procedure 

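For testing (1.1) in high dimensions, we use the sup-norm criterion. Such a choice is motivated by the fact that the sup-norm is very sensitive to perturbations on a small number of entries compared to the null hypothesis. Accordingly, we propose the test statistic:

$$M_n := \max_{1 \le i,j \le q} M_{ij} \quad \text{with} \quad M_{ij} := \frac{(\hat{u}_{1,ij} - \hat{u}_{2,ij})^2}{\hat{\sigma}^2(\hat{u}_{1,ij}) + \hat{\sigma}^2(\hat{u}_{2,ij})} \quad \text{for } 1 \le i, j \le q. \tag{2.4}$$

In (2.4), $\hat{\sigma}^2(\hat{u}_{1,ij})$ is a Jackknife estimator of $\hat{u}_{1,ij}$’s variance and is defined as

$$\hat{\sigma}^2(\hat{u}_{1,ij}) := \frac{m^2(n_1 - 1)}{n_1(n_1 - m)^2} \sum_{\alpha=1}^{n_1} (q_{1\alpha,ij} - \hat{u}_{1,ij})^2, \tag{2.5}$$

with

$$q_{1\alpha,ij} := \binom{n_1 - 1}{m - 1}^{-1} \sum_{\substack{1 \le \ell_1 < \cdots < \ell_{m-1} \le n_1 \\ \ell_k \neq \alpha,\ k = 1, \ldots, m-1}} \Phi_{ij}(\mathbf{X}_\alpha, \mathbf{X}_{\ell_1}, \ldots, \mathbf{X}_{\ell_{m-1}}).$$

The definition of $\hat{\sigma}^2(\hat{u}_{2,ij})$ is similar for $\mathbf{Y}$.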
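For the Kendall’s tau kernel ($m = 2$), the Jackknife variance estimator above reduces to a sum over per-observation averages. The sketch below is a hedged illustration for a single pair of coordinates, assuming the estimator has the form $m^2(n-1)/(n(n-m)^2) \sum_\alpha (q_\alpha - \hat{u})^2$; the function name `jackknife_variance_tau` and the toy data are hypothetical.

```python
def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def jackknife_variance_tau(x, y):
    """Return (tau_hat, jackknife variance estimate) for paired samples x and y.

    q[a] averages the order-2 sign kernel over all partners l != a, so the
    mean of q equals the Kendall's tau estimate itself.
    """
    n = len(x)
    m = 2  # kernel order of Kendall's tau
    q = []
    for a in range(n):
        total = sum(
            sign(x[a] - x[l]) * sign(y[a] - y[l]) for l in range(n) if l != a
        )
        q.append(total / (n - 1))
    tau_hat = sum(q) / n
    factor = m * m * (n - 1) / (n * (n - m) ** 2)
    variance = factor * sum((qa - tau_hat) ** 2 for qa in q)
    return tau_hat, variance


# Toy data (hypothetical): one discordant pair out of six.
tau_hat, var_hat = jackknife_variance_tau([1.0, 2.0, 3.0, 4.0], [1.0, 3.0, 2.0, 4.0])
```

The same value is obtained from the classical leave-one-out form $(n-1)/n \sum_\alpha (\hat{u}_{(-\alpha)} - \hat{u})^2$, because $\hat{u}_{(-\alpha)} - \hat{u} = -m(q_\alpha - \hat{u})/(n - m)$.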

For a given significance level $0 < \alpha < 1$, we construct the test to be

$$T_\alpha := \mathbf{1}\{M_n \ge G^-(\alpha) + 4\log q - \log(\log q)\}, \tag{2.6}$$

where $G^-(\alpha) := -\log(8\pi) - 2\log(-\log(1 - \alpha))$. We reject $H_0$ in (1.1) if and only if $T_\alpha = 1$.
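The rejection rule in (2.6) only needs the observed statistic, the level and the matrix size. The following minimal sketch evaluates the threshold; the function names are assumptions for illustration, and `limiting_cdf` encodes the extreme value type I limit $\exp(-\exp(-x/2)/\sqrt{8\pi})$ stated in Theorem 2.2.

```python
import math


def g_minus(alpha):
    """Quantile term G^-(alpha) = -log(8*pi) - 2*log(-log(1 - alpha))."""
    return -math.log(8.0 * math.pi) - 2.0 * math.log(-math.log(1.0 - alpha))


def rejection_threshold(alpha, q):
    """Threshold G^-(alpha) + 4*log(q) - log(log(q)); requires q > 1."""
    return g_minus(alpha) + 4.0 * math.log(q) - math.log(math.log(q))


def reject_null(m_n, alpha, q):
    """Return True when the observed sup-type statistic m_n reaches the threshold."""
    return m_n >= rejection_threshold(alpha, q)


def limiting_cdf(x):
    """Limiting distribution function of M_n - 4*log(q) + log(log(q)) under the null."""
    return math.exp(-math.exp(-x / 2.0) / math.sqrt(8.0 * math.pi))


# Sanity check: G^-(alpha) is the (1 - alpha) quantile of the limiting law.
coverage = limiting_cdf(g_minus(0.05))
```

The threshold grows like $4\log q$, so larger matrices require a larger maximal standardized difference before the null is rejected.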

In some applications, our interest is to compare a particular row or column of matrices, i.e., we aim at testing the hypothesis:

$$H_{0,i}: \mathbf{u}_{1,i*} = \mathbf{u}_{2,i*} \quad \text{v.s.} \quad H_{1,i}: \mathbf{u}_{1,i*} \neq \mathbf{u}_{2,i*}, \tag{2.7}$$

where $\mathbf{u}_{1,i*}$ and $\mathbf{u}_{2,i*}$ are the $i$-th rows of $\mathbf{U}_1$ and $\mathbf{U}_2$. To test this hypothesis, we construct a similar test statistic $M_{n,i} = \max_{1 \le j \le q} M_{ij}$, and the corresponding test is

$$T_{\alpha,i} = \mathbf{1}\{M_{n,i} \ge G'^-(\alpha) + 2\log q - \log\log q\}, \tag{2.8}$$

where $G'^-(\alpha) := -\log(\pi) - 2\log(-\log(1 - \alpha))$. We reject $H_{0,i}$ if and only if $T_{\alpha,i} = 1$.


2.2 Theoretical Properties 


Our main theoretical result is to characterize the limiting null distribution of M n . We further 
analyze the power of the proposed test under a sparse alternative. 

We introduce three assumptions that will be used later. Assumption (A1) specifies the sparsity of $\mathbf{U} = \mathbf{U}_1 = \mathbf{U}_2$. Assumption (A2) specifies the scaling of $q$ and $n$. Assumption (A3) is a technical condition that we impose for obtaining the limiting distribution of $M_n$. In Section 3.2, we will show that Assumption (A3) can be further relaxed under a semiparametric Gaussian copula model.

In detail, for a fixed constant $\alpha_0 > 0$, we define

$$\mathrm{supp}_j(\alpha_0) := \{1 \le i \le q : |u_{1,ij}| \ge (\log q)^{-1-\alpha_0} \text{ or } |u_{2,ij}| \ge (\log q)^{-1-\alpha_0}\}.$$

$\mathrm{supp}_j(\alpha_0)$ is the set of indices $i$ such that either the $i$-th variable of $\mathbf{X}$ is highly correlated ($|u_{a,ij}| \ge (\log q)^{-1-\alpha_0}$) with the $j$-th variable of $\mathbf{X}$, or the $i$-th variable of $\mathbf{Y}$ is highly correlated with the $j$-th variable of $\mathbf{Y}$. We then introduce Assumption (A1) as follows.

(A1). We assume that there exists a subset $\Upsilon \subset \{1, 2, \ldots, q\}$ with $|\Upsilon| = o(q)$ and a constant $\alpha_0 > 0$ such that for all $\gamma > 0$, we have

$$\max_{1 \le j \le q,\ j \notin \Upsilon} |\mathrm{supp}_j(\alpha_0)| = o(q^\gamma).$$
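Before stating Assumption (A2), we first introduce some additional notations. We define

$$g_{ij}(\mathbf{X}_\ell) := \mathbb{E}[\Phi_{ij}(\mathbf{X}_{\ell_1}, \ldots, \mathbf{X}_{\ell_m}) \mid \mathbf{X}_\ell] \quad \text{and} \quad h_{ij}(\mathbf{X}_\ell) := g_{ij}(\mathbf{X}_\ell) - u_{1,ij}, \tag{2.9}$$

where $\{\ell_1, \ldots, \ell_m\}$ is an arbitrary subset of $\{1, \ldots, n_1\}$ with distinct elements and contains $\ell$. $g_{ij}(\mathbf{Y}_\ell)$ and $h_{ij}(\mathbf{Y}_\ell)$ are similarly defined for $\ell = 1, \ldots, n_2$. We then denote $\zeta_{1,ij}$ to be the variance of $g_{ij}(\mathbf{X}_\ell)$, i.e.,

$$\zeta_{1,ij} := \mathrm{Var}\big(\mathbb{E}[\Phi_{ij}(\mathbf{X}_{\ell_1}, \ldots, \mathbf{X}_{\ell_m}) \mid \mathbf{X}_\ell]\big) = \mathrm{Var}(g_{ij}(\mathbf{X}_\ell)). \tag{2.10}$$

Similarly, we define $\zeta_{2,ij} := \mathrm{Var}(g_{ij}(\mathbf{Y}_\ell))$.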

With these notations, we are now ready to state Assumption (A2).

(A2). We assume $n_1 \asymp n_2 \asymp n$. We also assume $\log q = O(n^{\epsilon})$ for an arbitrary $\epsilon > 0$ and $\zeta_{a,ij} \ge r_a > 0$ for $a = 1, 2$, where $r_1$ and $r_2$ are constants that do not depend on $i$ or $j$.

The condition that $\zeta_{a,ij} \ge r_a > 0$ is mild. It is used to exclude the degenerate cases of U-statistics and has been widely used for analyzing U-statistics.

To describe Assumption (A3), we write

$$S := \{(i,j) : 1 \le i, j \le q\} \quad \text{and} \quad S_0 := \{(i,j) : 1 \le i \le q,\ i \in \mathrm{supp}_j(\alpha_0)\}. \tag{2.11}$$

By the definition of $S_0$, for any $(i,j) \in S \setminus S_0$, we have $|u_{a,ij}| < (\log q)^{-1-\alpha_0}$. Moreover, we use $u_{1,ijk\ell}$ and $u_{2,ijk\ell}$ to denote $\mathbb{E}[g_{ij}(\mathbf{X}_\ell) g_{k\ell}(\mathbf{X}_\ell)]$ and $\mathbb{E}[g_{ij}(\mathbf{Y}_\ell) g_{k\ell}(\mathbf{Y}_\ell)]$.

(A3). For any $(i,j)$ and $(k,\ell)$ in $S \setminus S_0$, we assume $u_{a,ijk\ell} = O((\log q)^{-1-\alpha_0})$ for $a = 1, 2$.

Under fully nonparametric models, we note that $u_{a,ijk\ell}$ is estimable (Klüppelberg and Kuhn, 2009). Thus it is possible to verify Assumption (A3) in applications. When we test the equality of two Kendall’s tau correlation matrices $\mathbf{U}_1^\tau$ and $\mathbf{U}_2^\tau$ under a semiparametric Gaussian copula model, Assumption (A3) can be replaced by a simplified condition which is easier to verify. More details are provided in Section 3.2.

Under the above assumptions, our main theoretical result quantifies the limiting distribution of 
the extreme value statistic M n . 

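Theorem 2.2. Assuming (A1), (A2), (A3) hold, under $H_0$ of (1.1), we have

$$\mathbb{P}\big(M_n - 4\log q + \log(\log q) \le x\big) \to \exp\Big(-\frac{1}{\sqrt{8\pi}} \exp\Big(-\frac{x}{2}\Big)\Big) \tag{2.12}$$

for any $x \in \mathbb{R}$, as $n, q \to \infty$. Furthermore, (2.12) holds uniformly for all random vectors $\mathbf{X}$ and $\mathbf{Y}$ satisfying Assumptions (A1), (A2) and (A3).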
Proof. We give a sketch of the proof. The detailed proof is in Appendix C.1. The proof proceeds in three steps.

Step (i) (Sketch). We set $\hat{\sigma}^2(\hat{u}_{a,ij})$ as the Jackknife estimator of the variance of $\hat{u}_{a,ij}$ and $\sigma^2(\hat{u}_{a,ij})$ as the true variance of $\hat{u}_{a,ij}$. We then analyze the estimation error of the Jackknife variance estimator by providing an upper bound of $|n_a \hat{\sigma}^2(\hat{u}_{a,ij}) - m^2 \zeta_{a,ij}|$, where $\zeta_{a,ij}$ is defined in (2.10). The central limit theorem for U-statistics (Lemma F.3) implies that $m^2 \zeta_{a,ij}$ is the limit of $n_a \sigma^2(\hat{u}_{a,ij})$ as $n_a$ goes to infinity. This motivates us to define

$$M_{ij} := \frac{(\hat{u}_{1,ij} - \hat{u}_{2,ij})^2}{\hat{\sigma}^2(\hat{u}_{1,ij}) + \hat{\sigma}^2(\hat{u}_{2,ij})} \quad \text{and} \quad \tilde{M}_{ij} := \frac{(\hat{u}_{1,ij} - \hat{u}_{2,ij})^2}{m^2 \zeta_{1,ij}/n_1 + m^2 \zeta_{2,ij}/n_2}. \tag{2.13}$$

In $\tilde{M}_{ij}$, we use $m^2 \zeta_{a,ij}/n_a$ to replace $\hat{\sigma}^2(\hat{u}_{a,ij})$ of $M_{ij}$.

Using the obtained upper bound of $|n_a \hat{\sigma}^2(\hat{u}_{a,ij}) - m^2 \zeta_{a,ij}|$, we then prove that $\max_{1 \le i,j \le q} M_{ij}$ and $\max_{1 \le i,j \le q} \tilde{M}_{ij}$ have the same limiting distribution, i.e., to obtain Theorem 2.2, it suffices to prove that

$$\lim_{n,q \to \infty} \mathbb{P}\big(\tilde{M}_n - 4\log q + \log(\log q) \le x\big) = \exp\big(-\exp(-x/2)/\sqrt{8\pi}\big), \tag{2.14}$$

where $\tilde{M}_n := \max_{1 \le i,j \le q} \tilde{M}_{ij}$.

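Step (ii) (Sketch). We use the Hoeffding decomposition (Lemma F.4) to decompose the centered U-statistic $\tilde{u}_{a,ij} := \hat{u}_{a,ij} - u_{a,ij}$. By the definition of $u_{a,ij}$, we have $\mathbb{E}[\tilde{u}_{a,ij}] = 0$. By the Hoeffding decomposition, we decompose $\tilde{u}_{a,ij}$ into two pieces. One is a sum of independent and identically distributed (i.i.d.) random variables and the other is the residual term. More specifically, we decompose $\tilde{u}_{a,ij}$ as

$$\tilde{u}_{1,ij} = \frac{m}{n_1} \sum_{\alpha=1}^{n_1} h_{ij}(\mathbf{X}_\alpha) + \binom{n_1}{m}^{-1} \Delta_{n_1,ij} \quad \text{and} \quad \tilde{u}_{2,ij} = \frac{m}{n_2} \sum_{\alpha=1}^{n_2} h_{ij}(\mathbf{Y}_\alpha) + \binom{n_2}{m}^{-1} \Delta_{n_2,ij}, \tag{2.15}$$

where we set

$$\Delta_{n_1,ij} = \sum_{1 \le \ell_1 < \ell_2 < \cdots < \ell_m \le n_1} \Big( \Phi_{ij}(\mathbf{X}_{\ell_1}, \ldots, \mathbf{X}_{\ell_m}) - u_{1,ij} - \sum_{k=1}^{m} h_{ij}(\mathbf{X}_{\ell_k}) \Big),$$

$$\Delta_{n_2,ij} = \sum_{1 \le \ell_1 < \ell_2 < \cdots < \ell_m \le n_2} \Big( \Phi_{ij}(\mathbf{Y}_{\ell_1}, \ldots, \mathbf{Y}_{\ell_m}) - u_{2,ij} - \sum_{k=1}^{m} h_{ij}(\mathbf{Y}_{\ell_k}) \Big).$$

Here $m \sum_{\alpha=1}^{n_1} h_{ij}(\mathbf{X}_\alpha)/n_1$ and $m \sum_{\alpha=1}^{n_2} h_{ij}(\mathbf{Y}_\alpha)/n_2$ are sums of i.i.d. random variables and $\binom{n_a}{m}^{-1} \Delta_{n_a,ij}$ is the residual term. The two sums are therefore approximations of $\tilde{u}_{1,ij}$ and $\tilde{u}_{2,ij}$. This motivates us to define

$$T_{ij} := \frac{\sum_{\alpha=1}^{n_1} h_{ij}(\mathbf{X}_\alpha)/n_1 - \sum_{\alpha=1}^{n_2} h_{ij}(\mathbf{Y}_\alpha)/n_2}{\sqrt{\zeta_{1,ij}/n_1 + \zeta_{2,ij}/n_2}} \quad \text{and} \quad T_n := \max_{1 \le i,j \le q} (T_{ij})^2. \tag{2.16}$$

We then prove that the small residual term $\binom{n_a}{m}^{-1} \Delta_{n_a,ij}$ is negligible for our theorem, i.e., to obtain Theorem 2.2, it suffices to prove that as $n, q \to \infty$, we have

$$\mathbb{P}\big(T_n - 4\log q + \log(\log q) \le x\big) \to \exp\big(-\exp(-x/2)/\sqrt{8\pi}\big). \tag{2.17}$$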


Step (iii) (Sketch). In the last step, we derive the limiting distribution of $T_n$ to prove (2.17). $T_n$ is the maximum of $(T_{ij})^2$ over $\{1 \le i, j \le q\}$ and the $T_{ij}$ are not independent of each other. Therefore, we cannot straightforwardly exploit the extreme value theorem under the independent setting to obtain the limiting distribution of $T_n$. To solve this problem, we exploit the normal approximation to get the extreme value distribution of $\{(T_{ij})^2\}_{1 \le i,j \le q}$ under the setting that the $T_{ij}$ can be dependent on each other. The detailed proof is in Appendix C.1. □


Theorem 2.2 justifies the size of the proposed test $T_\alpha$ in (2.6). It shows that under $H_0$ of (1.1), $M_n - 4\log q + \log(\log q)$ converges weakly to an extreme value Type I distribution with the distribution function $F(t) = \exp\big(-\exp(-t/2)/\sqrt{8\pi}\big)$.
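To make the link between the limiting law and the threshold in (2.6) explicit, the short derivation below shows that the test has asymptotic size $\alpha$. It is a worked step added for illustration and uses only the definition $G^-(\alpha) = -\log(8\pi) - 2\log(-\log(1-\alpha))$ and the limit stated in Theorem 2.2.

```latex
\begin{align*}
\mathbb{P}_{H_0}(T_\alpha = 1)
  &= 1 - \mathbb{P}_{H_0}\bigl(M_n - 4\log q + \log(\log q) < G^-(\alpha)\bigr) \\
  &\to 1 - \exp\Bigl(-\tfrac{1}{\sqrt{8\pi}}\exp\bigl(-G^-(\alpha)/2\bigr)\Bigr), \\
\exp\bigl(-G^-(\alpha)/2\bigr)
  &= \exp\Bigl(\tfrac{1}{2}\log(8\pi) + \log\bigl(-\log(1-\alpha)\bigr)\Bigr)
   = -\sqrt{8\pi}\,\log(1-\alpha), \\
\text{so}\quad \mathbb{P}_{H_0}(T_\alpha = 1)
  &\to 1 - \exp\bigl(\log(1-\alpha)\bigr) = \alpha .
\end{align*}
```

The first limit uses the continuity of the limiting distribution function, so the distinction between strict and non-strict inequalities does not matter asymptotically.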

Remark 2.3. Theorem 2.2 provides a unified framework for testing the equality of two large U-statistic based matrices, which include rank-based correlation matrices as special examples. Our test method exploits the Jackknife strategy and extreme value statistics, and it works under a fully nonparametric model. Technically, for proving Theorem 2.2, we develop a set of tools for analyzing the Jackknife variance estimator defined in (2.5), which is technically nontrivial and is of independent interest for analyzing U-statistics in more general settings.


Next, we analyze the power of $T_\alpha$. To this end, we first introduce an alternative hypothesis characterized by the following set of matrix pairs

$$\mathbb{A}(C) := \Big\{ (\mathbf{U}_1, \mathbf{U}_2) : \max_{1 \le i,j \le q} \frac{|u_{1,ij} - u_{2,ij}|}{\sqrt{m^2 \zeta_{1,ij}/n_1 + m^2 \zeta_{2,ij}/n_2}} \ge C\sqrt{\log q} \Big\},$$

where $C > 0$ is a constant. If a single entry of $\mathbf{U}_1$ and $\mathbf{U}_2$ differs by a large enough amount, then $(\mathbf{U}_1, \mathbf{U}_2) \in \mathbb{A}(C)$ for some constant $C$. The next theorem shows that the null hypothesis is asymptotically distinguishable from $\mathbb{A}(4)$ by $T_\alpha$. In other words, we reject $H_0$ in (1.1) using $T_\alpha$ with overwhelming probability if $(\mathbf{U}_1, \mathbf{U}_2) \in \mathbb{A}(4)$.


Theorem 2.4. (Power of the Test $T_\alpha$) If (A2) is satisfied, as $n, q \to \infty$ we have

$$\inf_{(\mathbf{U}_1, \mathbf{U}_2) \in \mathbb{A}(4)} \mathbb{P}(T_\alpha = 1) \to 1. \tag{2.18}$$

Remark 2.5. By the above theorem, it is enough for a single entry of $\mathbf{U}_1 - \mathbf{U}_2$ to have a magnitude larger than $C\sqrt{\log q/n}$ for the test $T_\alpha$ to correctly reject $H_0$ of (1.1). We do not impose Assumptions (A1) and (A3) to obtain this result.


2.3 Testing Rows or Columns of Two U-statistic Based Matrices 

In some applications, instead of testing the equality of two full matrices, we are interested in 
testing the equality of a particular row or column of the given matrix pair. This requires us to test 
the hypothesis in (2.7). For simplicity, we only present the result for the row comparison in this 
subsection. The application to the column comparison is straightforward. 

To test the hypothesis in (2.7), we define the test statistic as

$$M_{n,i} = \max_{1 \le j \le q} M_{ij}.$$

The following theorem derives the limiting distribution of $M_{n,i}$ under the null hypothesis.
Theorem 2.6. If the null hypothesis $H_{0,i}$ in (2.7) and the conditions in Theorem 2.2 hold, we have

$$\mathbb{P}\big(M_{n,i} - 2\log q + \log\log q \le x\big) \to \exp\Big(-\frac{1}{\sqrt{\pi}} \exp\Big(-\frac{x}{2}\Big)\Big) \tag{2.19}$$

for any given $x \in \mathbb{R}$, as $n, q \to \infty$.

The above theorem can be proved in a similar way to Theorem 2.2.

Remark 2.7. For analyzing the power of $T_{\alpha,i}$, we define the following set of vector pairs,

$$\mathbb{A}_{i*}(C) := \Big\{ (\mathbf{u}_{1,i*}, \mathbf{u}_{2,i*}) : \max_{1 \le j \le q} \frac{|u_{1,ij} - u_{2,ij}|}{\sqrt{m^2 \zeta_{1,ij}/n_1 + m^2 \zeta_{2,ij}/n_2}} \ge C\sqrt{\log q} \Big\}.$$

This allows us to obtain a result similar to Theorem 2.4.


3 Applications to Testing Large Kendall’s tau Correlation Matrix 

In this section, we focus on testing the equality of two Kendall’s tau matrices $\mathbf{U}_1^\tau$ and $\mathbf{U}_2^\tau$. This section contains two parts. In the first part, we assume the samples are from a fully nonparametric model. Under this model, in addition to the general Jackknife-based approach outlined in the last section, we introduce two additional methods for testing (1.2) and analyze their theoretical properties (e.g., size and power). In the second part, we assume the samples are generated from a Gaussian copula model, under which we can relax Assumption (A3) to a much simplified form.

Kendall’s tau provides a way to describe the nonlinear relationship between two random variables. As it is rank-based, it is especially suitable for analyzing data from heavy-tailed distributions. In this section, we aim to test the equality of two Kendall’s tau matrices. More specifically, we set

$$\Phi_{ij}(\mathbf{X}_k, \mathbf{X}_\ell) := \mathrm{sign}(X_{ki} - X_{\ell i})\,\mathrm{sign}(X_{kj} - X_{\ell j}),$$

$$\Phi_{ij}(\mathbf{Y}_k, \mathbf{Y}_\ell) := \mathrm{sign}(Y_{ki} - Y_{\ell i})\,\mathrm{sign}(Y_{kj} - Y_{\ell j}),$$

and $q = d$. We aim to test whether $\mathbf{U}_1^\tau = \mathbf{U}_2^\tau$.

3.1 Methods and Theory under Fully Nonparametric Models 

Section 3.1 contains two parts. The first part introduces two additional test procedures tailored for testing the equality of Kendall’s tau matrices $\mathbf{U}_1^\tau$ and $\mathbf{U}_2^\tau$. The second part proves the theoretical properties of the three tests. In addition, we further prove the rate-optimality of the proposed tests. Our technical contributions include providing an upper bound of the traditional plug-in variance estimation error, which enables us to establish the explicit rate of convergence of the plug-in variance estimator. We also prove an upper bound of the variance difference between two Kendall’s tau correlation coefficients. These bounds allow us to derive the limiting distribution of two additional test statistics. The construction of these bounds requires the nontrivial usage of special structures of variance estimators and is of independent interest. Moreover, for proving our test methods’ optimality for the Kendall’s tau matrix comparison, we construct a collection of least favourable multivariate normal distributions with regard to the test hypothesis. This novel construction technique is developed for the correlation matrix comparison and is one of our technical contributions.

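Recall that $\hat{\mathbf{U}}_1^\tau$ and $\hat{\mathbf{U}}_2^\tau$, defined in (2.3), are symmetric and we have $\mathrm{Diag}(\hat{\mathbf{U}}_1^\tau) = \mathrm{Diag}(\hat{\mathbf{U}}_2^\tau) = \mathbf{I}_d$. Therefore, we do not need to compare the main diagonals of $\hat{\mathbf{U}}_a^\tau$. Hence, we reset $S = \{(i,j) : 1 \le i < j \le d\}$ for testing the equality of large Kendall’s tau correlation matrices. The Jackknife-based statistic $M_n$ in Section 2.1 then becomes

$$M_n^{\tau,\mathrm{jack}} := \max_{(i,j) \in S} \frac{(\hat{\tau}_{1,ij} - \hat{\tau}_{2,ij})^2}{\hat{\sigma}^2(\hat{\tau}_{1,ij}) + \hat{\sigma}^2(\hat{\tau}_{2,ij})}. \tag{3.1}$$

Here, we still use $\hat{\sigma}^2(\cdot)$ to denote the Jackknife variance estimator. Accordingly, we obtain the test $T_\alpha^{\tau,\mathrm{jack}}$:

$$T_\alpha^{\tau,\mathrm{jack}} := \mathbf{1}\{M_n^{\tau,\mathrm{jack}} \ge G^-(\alpha) + 4\log d - \log(\log d)\}.$$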


3.1.1 Three Procedures to Compare Kendall’s tau Matrices 


In this section, we present two additional methods for comparing two Kendall’s tau matrices. We start with the introduction of a plug-in method, which directly estimates the variances of $\{\hat{\tau}_{a,ij}\}_{a=1,2}$ and plugs them into the test statistic. For this, recall that the Kendall’s tau sample correlation between two random variables $U$ and $V$ is calculated by

$$\hat{\tau} = \frac{2}{n(n-1)} \sum_{1 \le i < j \le n} \mathrm{sign}(U_i - U_j)\,\mathrm{sign}(V_i - V_j),$$

where $U_1, \ldots, U_n$ and $V_1, \ldots, V_n$ are $n$ random samples from $U$ and $V$. Let $\Pi_c$ be the probability that two members drawn from the sample without replacement are concordant with each other. In other words, we have

$$\Pi_c = \mathbb{P}\big((U_2 - U_1)(V_2 - V_1) > 0\big). \tag{3.2}$$

Kruskal (1958) proves that the variance of $\hat{\tau}$ can be written as

$$\mathrm{Var}(\hat{\tau}) = \frac{8}{n(n-1)} \Pi_c(1 - \Pi_c) + 16\,\frac{1}{n}\,\frac{n-2}{n-1}\,(\Pi_{cc} - \Pi_c^2), \tag{3.3}$$

where $\Pi_{cc}$ is the probability that, among three members drawn from the sample without replacement, the second and third are concordant with the first. In other words, we have

$$\Pi_{cc} = \mathbb{P}\big(\{(U_2 - U_1)(V_2 - V_1) > 0\} \cap \{(U_3 - U_1)(V_3 - V_1) > 0\}\big). \tag{3.4}$$

As $n \to \infty$, the quantity in (3.3) multiplied by $n$ has the limit $16(\Pi_{cc} - \Pi_c^2)$. Motivated by this result, we propose the following plug-in estimator

$$\hat{\sigma}^2_{\mathrm{plug}}(\hat{\tau}) = \frac{16}{n}(\hat{\Pi}_{cc} - \hat{\Pi}_c^2), \tag{3.5}$$

as an alternative to the Jackknife based one, for estimating the variance of $\hat{\tau}$. Here $\hat{\Pi}_{cc}$ and $\hat{\Pi}_c$ are the corresponding U-statistics to estimate $\Pi_{cc}$ and $\Pi_c$. We replace $\hat{\sigma}^2(\cdot)$ in $M_n^{\tau,\mathrm{jack}}$ with $\hat{\sigma}^2_{\mathrm{plug}}(\cdot)$ to construct $M_n^{\tau,\mathrm{plug}}$:

$$M_n^{\tau,\mathrm{plug}} := \max_{1 \le i < j \le d} \frac{(\hat{\tau}_{1,ij} - \hat{\tau}_{2,ij})^2}{\hat{\sigma}^2_{\mathrm{plug}}(\hat{\tau}_{1,ij}) + \hat{\sigma}^2_{\mathrm{plug}}(\hat{\tau}_{2,ij})}. \tag{3.6}$$
Accordingly, we construct the plug-in type test $T_\alpha^{\tau,\mathrm{plug}}$ as follows:

$$T_\alpha^{\tau,\mathrm{plug}} := \mathbf{1}\{M_n^{\tau,\mathrm{plug}} \ge G^-(\alpha) + 4\log d - \log(\log d)\}.$$

In Section 3.1.2, we will provide the theoretical justification for this plug-in procedure.
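The plug-in procedure can be sketched for a single pair of variables as follows. This is a hedged illustration rather than a reference implementation: it assumes the plug-in estimator takes the form $(16/n)(\hat{\Pi}_{cc} - \hat{\Pi}_c^2)$, it estimates $\Pi_c$ and $\Pi_{cc}$ by the natural U-statistics (fractions of concordant pairs and of "doubly concordant" triples), and the function names and toy data are hypothetical. It requires $n \ge 3$.

```python
from math import comb


def concordance_probabilities(u, v):
    """Return U-statistic estimates (pi_c, pi_cc) from paired samples u and v.

    counts[i] is the number of other observations concordant with observation i.
    pi_c is the fraction of concordant pairs; pi_cc averages, over a base
    observation i and an unordered pair of others, the event that both are
    concordant with i.
    """
    n = len(u)
    counts = [
        sum(1 for j in range(n) if j != i and (u[j] - u[i]) * (v[j] - v[i]) > 0)
        for i in range(n)
    ]
    pi_c = sum(counts) / (n * (n - 1))
    pi_cc = sum(comb(c, 2) for c in counts) / (n * comb(n - 1, 2))
    return pi_c, pi_cc


def plugin_variance(u, v):
    """Plug-in variance estimate (16 / n) * (pi_cc - pi_c ** 2)."""
    n = len(u)
    pi_c, pi_cc = concordance_probabilities(u, v)
    return 16.0 / n * (pi_cc - pi_c ** 2)


# Toy data (hypothetical): one discordant pair out of six.
toy_pi_c, toy_pi_cc = concordance_probabilities([1.0, 2.0, 3.0, 4.0], [1.0, 3.0, 2.0, 4.0])
toy_var = plugin_variance([1.0, 2.0, 3.0, 4.0], [1.0, 3.0, 2.0, 4.0])
```

For very small samples such as this toy example, the difference $\hat{\Pi}_{cc} - \hat{\Pi}_c^2$ can be negative; the sketch makes no attempt to truncate it, since the estimator targets the large-sample regime.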

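Both the theoretical and numerical results indicate that the variance estimation error is a key factor influencing the power of the test statistics. Up to now, we have considered two kinds of variance estimation procedures (Jackknife based and plug-in based) for testing the equality of two Kendall’s tau matrices. To exploit the sparsity of $\mathbf{U}^\tau$, we next propose to use the exact variance under the uncorrelated condition ($\tau = 0$). We name this procedure the “pseudo method”. It calculates the variance of $\hat{\tau}_{a,ij}$ by assuming $\tau_{a,ij} = 0$. We set $\tilde{\sigma}^2_{1,\mathrm{ps}}$ and $\tilde{\sigma}^2_{2,\mathrm{ps}}$ as the variances of $\sqrt{n_1}\,\hat{\tau}_1$ and $\sqrt{n_2}\,\hat{\tau}_2$ under $\tau_1 = 0$ and $\tau_2 = 0$. We also set

$$\sigma^2_{a,\mathrm{ps}} := \lim_{n_a \to \infty} \tilde{\sigma}^2_{a,\mathrm{ps}} \quad \text{for } a = 1, 2. \tag{3.7}$$

The test statistic becomes

$$M_n^{\tau,\mathrm{ps}} := \max_{1 \le i < j \le d} \frac{(\hat{\tau}_{1,ij} - \hat{\tau}_{2,ij})^2}{\sigma^2_{1,\mathrm{ps}}/n_1 + \sigma^2_{2,\mathrm{ps}}/n_2}. \tag{3.8}$$

Similarly, we construct the test $T_\alpha^{\tau,\mathrm{ps}}$:

$$T_\alpha^{\tau,\mathrm{ps}} := \mathbf{1}\{M_n^{\tau,\mathrm{ps}} \ge G^-(\alpha) + 4\log d - \log(\log d)\}. \tag{3.9}$$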

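For example, if $\mathbf{X}$ and $\mathbf{Y}$ are generated from a continuous Gaussian copula model, we have

$$\tilde{\sigma}^2_{1,\mathrm{ps}} = \frac{2(2n_1 + 5)}{9(n_1 - 1)} \quad \text{and} \quad \tilde{\sigma}^2_{2,\mathrm{ps}} = \frac{2(2n_2 + 5)}{9(n_2 - 1)}.$$

Correspondingly, we obtain

$$\sigma^2_{1,\mathrm{ps}} = \sigma^2_{2,\mathrm{ps}} = \frac{4}{9}.$$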
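The closed form above can be cross-checked against Kruskal’s variance formula (3.3), namely $\frac{8}{n(n-1)}\Pi_c(1-\Pi_c) + \frac{16(n-2)}{n(n-1)}(\Pi_{cc}-\Pi_c^2)$. The sketch below uses exact rational arithmetic and takes as an assumption the standard values $\Pi_c = 1/2$ and $\Pi_{cc} = 5/18$ for two independent continuous variables, which is the situation described by $\tau = 0$ under a Gaussian copula; the function and constant names are hypothetical.

```python
from fractions import Fraction


def kruskal_variance(n, pi_c, pi_cc):
    """Variance of the Kendall's tau estimate from n samples, following (3.3)."""
    first = Fraction(8, n * (n - 1)) * pi_c * (1 - pi_c)
    second = Fraction(16 * (n - 2), n * (n - 1)) * (pi_cc - pi_c ** 2)
    return first + second


def pseudo_variance(n):
    """Variance of sqrt(n) * tau_hat under independence: 2(2n + 5) / (9(n - 1))."""
    return Fraction(2 * (2 * n + 5), 9 * (n - 1))


# Assumed concordance probabilities for independent continuous variables.
PI_C_INDEP = Fraction(1, 2)
PI_CC_INDEP = Fraction(5, 18)

# n * Var(tau_hat) should coincide with the closed form for every sample size.
formulas_agree = all(
    n * kruskal_variance(n, PI_C_INDEP, PI_CC_INDEP) == pseudo_variance(n)
    for n in range(3, 200)
)
```

As $n$ grows, `pseudo_variance(n)` approaches $4/9$, the limiting value used in the pseudo test statistic.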

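Remark 3.1. As long as $|\tilde{\sigma}^2_{a,\mathrm{ps}} - \sigma^2_{a,\mathrm{ps}}| = o((\log d)^{-1-\epsilon})$ with an arbitrary $\epsilon > 0$, we can show that replacing $\sigma^2_{a,\mathrm{ps}}$ with $\tilde{\sigma}^2_{a,\mathrm{ps}}$ still gives a valid test. More details are provided in Theorem 3.3.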


3.1.2 Theoretical Properties of Three Testing Procedures 

We now present the theoretical properties (size, power, and optimality) of the three tests introduced in the previous sections. More specifically, we prove their validity under the null hypothesis and conduct power analysis similarly to Theorems 2.2 and 2.4. Furthermore, we show that these tests are rate optimal against the sparse alternative.

First, the following theorem gives the limiting distribution for the plug-in and Jackknife based test statistics.


Theorem 3.2. Assuming (A1), (A2) and (A3) hold, under $H_0$ of (1.2), we have

$$P\big(M_n^{\tau,\mathrm{jack}} - 4\log d + \log(\log d) \le x\big) \to \exp\Big(-\frac{1}{\sqrt{8\pi}}\exp\Big(-\frac{x}{2}\Big)\Big), \tag{3.10}$$

$$P\big(M_n^{\tau,\mathrm{plug}} - 4\log d + \log(\log d) \le x\big) \to \exp\Big(-\frac{1}{\sqrt{8\pi}}\exp\Big(-\frac{x}{2}\Big)\Big), \tag{3.11}$$

for any $x \in \mathbb{R}$, as $n, d \to \infty$. Furthermore, the results hold uniformly for all $X$ and $Y$ satisfying
(A1), (A2) and (A3).
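For reference, the critical value implied by this limit can be worked out by inverting the limiting distribution function. The short derivation below assumes that the quantity $G^{-}(\alpha)$ used in the tests is the $(1-\alpha)$-quantile $q_\alpha$ of the limit; this reading of the notation is an assumption.

```latex
\begin{align*}
\exp\Big(-\frac{1}{\sqrt{8\pi}}\, e^{-q_\alpha/2}\Big) = 1 - \alpha
  &\iff e^{-q_\alpha/2} = \sqrt{8\pi}\,\log\frac{1}{1-\alpha} \\
  &\iff q_\alpha = -\log(8\pi) - 2\log\log\frac{1}{1-\alpha}.
\end{align*}
```

For $\alpha = 0.05$ this gives $q_\alpha \approx 2.72$, so under this reading a test rejects when its statistic exceeds roughly $2.72 + 4\log d - \log(\log d)$.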


The following theorem gives the limiting distribution of the pseudo method. It holds under an
additional meta-elliptical (defined in Appendix A) distributional assumption on the data.

Theorem 3.3. We assume $X$ and $Y$ belong to the meta-elliptical distribution (Fang et al., 2002).$^4$
If Assumptions (A1), (A2) and (A3) hold, under $H_0$ in (1.2), we have

$$P\big(M_n^{\tau,\mathrm{ps}} - 4\log d + \log(\log d) \le x\big) \to \exp\Big(-\frac{1}{\sqrt{8\pi}}\exp\Big(-\frac{x}{2}\Big)\Big), \tag{3.12}$$

for any $x \in \mathbb{R}$, as $n, d \to \infty$. Furthermore, the result holds uniformly for all $X$ and $Y$ satisfying
(A1), (A2) and (A3).

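We now analyze the powers of $T_\alpha^{\tau,\mathrm{jack}}$, $T_\alpha^{\tau,\mathrm{plug}}$ and $T_\alpha^{\tau,\mathrm{ps}}$. Similarly to Theorem 2.4, we define

$$\mathbb{U}(C) := \Big\{(U_1^\tau, U_2^\tau) : \max_{1 \le i < j \le d} \frac{|\tau_{1,ij} - \tau_{2,ij}|}{\sqrt{4\zeta_{1,ij}/n_1 + 4\zeta_{2,ij}/n_2}} \ge C\sqrt{\log d}\Big\},$$

$$\mathbb{V}(C) := \Big\{(U_1^\tau, U_2^\tau) : \max_{1 \le i < j \le d} \frac{|\tau_{1,ij} - \tau_{2,ij}|}{\sqrt{\sigma^2_{1,\mathrm{ps}}/n_1 + \sigma^2_{2,\mathrm{ps}}/n_2}} \ge C\sqrt{\log d}\Big\}.$$

They are the Kendall’s tau versions of the alternative class in Theorem 2.4.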


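Theorem 3.4. (Power Analysis) Assuming (A2) holds, we have

$$\inf_{(U_1^\tau, U_2^\tau) \in \mathbb{U}(4)} P\big(T_\alpha^{\tau,\mathrm{jack}} = 1\big) \to 1, \tag{3.13}$$

$$\inf_{(U_1^\tau, U_2^\tau) \in \mathbb{U}(4)} P\big(T_\alpha^{\tau,\mathrm{plug}} = 1\big) \to 1, \tag{3.14}$$

as $n, d \to \infty$. If $X$ and $Y$ belong to the meta-elliptical family and (A1), (A2) are satisfied, as
$n, d \to \infty$, we have

$$\inf_{(U_1^\tau, U_2^\tau) \in \mathbb{V}(4)} P\big(T_\alpha^{\tau,\mathrm{ps}} = 1\big) \to 1. \tag{3.15}$$

Theorem 3.4 implies that a single entry of $U_1^\tau - U_2^\tau$ with magnitude no smaller than $C\sqrt{\log d/n}$
is enough for the tests to correctly reject $H_0$.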

Next, we show that all three proposed methods are rate optimal by matching the obtained
rates of convergence to a lower bound for correlation matrix comparison. We adopt the general
framework used in Baraud (2002) for testing the equality of correlation matrices. The core of the
proof is the construction of collections of least favourable multivariate normal distributions with
regard to the test hypothesis. Our work is related to Cai et al. (2013), who prove the lower
bound for testing the equality of covariance matrices. However, their construction technique is
developed for covariance matrices rather than correlation matrices. Specifically, they only perturb
the diagonal elements of the covariance matrix, which does not affect the resulting correlation matrices.
To test correlation matrices, we need to develop a novel construction that perturbs the off-diagonal
elements of the correlation matrices. More details are provided in the proof of Theorem 3.5.

$^4$ A detailed introduction to the meta-elliptical distribution family is provided in Appendix A.

Theorem 3.5. Let $\alpha, \beta > 0$ and $\alpha + \beta < 1$. Assuming that $\log d/n = o(1)$, there exists a
sufficiently small positive number $c_0$ such that, for any distribution family that contains the Gaussian
family as a subfamily, and all large enough $n$ and $d$, we have

$$\inf_{T_\alpha \in \mathcal{T}_\alpha}\ \sup_{(U_1^\tau, U_2^\tau) \in \mathbb{U}(c_0)} P(T_\alpha = 0) \ge 1 - \alpha - \beta, \tag{3.16}$$

where $\mathcal{T}_\alpha$ represents all level $\alpha$ tests for testing the equality of two matrices.

Cai et al. (2013) give a similar result for testing the equality of two covariance matrices. They
show that the rate $C\sqrt{\log d/n}$ is optimal for comparing covariance matrices under the condition that
$X$ and $Y$ have sub-Gaussian-type or polynomial-type tails. In comparison, the lower bound result
in Theorem 3.5 illustrates that our proposed methods are rate optimal under the general nonparametric
model. In particular, we do not impose assumptions on the marginal distributions.

3.2 Methods and Theory under Semiparametric Gaussian Copula Models 

In this section, we assume that $X$ and $Y$ are $d$-dimensional random vectors from the Gaussian
copula with latent correlation matrices $\Sigma_a = (\sigma_{a,ij})$, $a = 1, 2$, and $\mathrm{Diag}(\Sigma_a) = I_d$.$^5$ Under the
Gaussian copula model, the technical assumption (A3) in Section 2.2 can be replaced by a much
simpler condition. Specifically, for $r \in (0,1)$, we define

$$\Omega(r) := \{1 \le i \le d : |\tau_{1,ij}| > r \text{ or } |\tau_{2,ij}| > r \text{ for some } j \ne i\}. \tag{3.17}$$

We describe the technical assumption (A4) as follows:

(A4). For some $r < 1$ and a sequence of numbers $Q_{d,r}$, we have $|\Omega(r)| \le Q_{d,r} = o(d)$.

Assumptions (A4) and (A3) are closely related. However, Assumption (A3) is not directly
implied by Assumptions (A1), (A2) and (A4). In fact, the relationship between (A3)
and (A4) is complicated. To see the exact relationship, we need some additional definitions.

First, we have $S = \{(i,j) : 1 \le i < j \le d\}$. We then define

$$C_0 = \{(i,j) : i \in \Omega(r) \cup \Gamma\} \cup \{(i,j) : j \in \Omega(r) \cup \Gamma\} \quad \text{and} \quad B_0 = S_0 \cup C_0,$$

where $\Gamma$ is defined in Assumption (A1) and $S_0$ is defined in (2.11). Furthermore, we denote
$A$ to be the largest subset of $S \setminus B_0$ such that any two pairs $(i,j), (k,\ell) \in A$ satisfy a
condition (*). A more detailed description of condition (*) is provided in the proof of Theorem
3.6. Essentially, it specifies that, for any $(k,\ell) \in S \setminus B_0$, there exists an $i_1 \in \{i,j,k,\ell\}$ such
that for any $j_1 \in \{i,j,k,\ell\} \setminus \{i_1\}$, we have $|\tau_{a,i_1 j_1}| = O((\log d)^{-1-\alpha_0})$. We also define $\tau_{a,ijk\ell}$ as the
Kendall’s tau version of $u_{a,ijk\ell}$ in Assumption (A3).

Under Assumptions (A1), (A2) and (A4), we can prove that for any $(k,\ell) \in A$, we have
$|\tau_{a,ijk\ell}| = O((\log d)^{-1-\alpha_0})$, which is essentially Assumption (A3) with $u_{a,ijk\ell}$ replaced by $\tau_{a,ijk\ell}$.
The only difference is that these conditions hold on $A$ instead of on $S \setminus S_0$ as in Assumption
(A3). Theorem 3.6 below specifies that Assumptions (A1), (A2) and (A4) can be used to
replace Assumptions (A1), (A2) and (A3) when we test the equality of Kendall’s tau correlation
matrices under the Gaussian copula model.

$^5$ A detailed definition of the Gaussian copula is given in Appendix A.

Theorem 3.6. Let $X$ and $Y$ be Gaussian copula random vectors with latent correlation matrices
$\Sigma_a$, $a = 1, 2$, and $\mathrm{Diag}(\Sigma_a) = I_d$. We assume that the smallest eigenvalue of any 4 by 4 principal
sub-matrix of $\Sigma_a$ is bounded away from 0. Assuming (A1), (A2) and (A4) hold, under $H_0$ of
(1.2), we have

$$P\big(M_n^{\tau,\mathrm{jack}} - 4\log d + \log(\log d) \le x\big) \to \exp\Big(-\frac{1}{\sqrt{8\pi}}\exp\Big(-\frac{x}{2}\Big)\Big),$$

$$P\big(M_n^{\tau,\mathrm{plug}} - 4\log d + \log(\log d) \le x\big) \to \exp\Big(-\frac{1}{\sqrt{8\pi}}\exp\Big(-\frac{x}{2}\Big)\Big),$$

$$P\big(M_n^{\tau,\mathrm{ps}} - 4\log d + \log(\log d) \le x\big) \to \exp\Big(-\frac{1}{\sqrt{8\pi}}\exp\Big(-\frac{x}{2}\Big)\Big),$$

for any $x \in \mathbb{R}$, as $n, d \to \infty$. Furthermore, these limiting results hold uniformly for all $X$ and $Y$
satisfying (A1), (A2), and (A4).

Proof. Recall that $M_n^{\tau,\mathrm{jack}}$, $M_n^{\tau,\mathrm{plug}}$ and $M_n^{\tau,\mathrm{ps}}$ in (3.1), (3.6) and (3.8) are defined by taking the
maximum over $S$. The main idea of the proof of Theorem 3.6 is to show that it is sufficient to use a
version of these quantities taking the maximum over the smaller set $A$ as defined before. The proof
is technical and is left to Appendix C.7. □

Remark 3.7. In Appendix A, we show that $U_a^\tau$ and $\Sigma_a$ are related through $\sigma_{a,ij} = \sin(\tau_{a,ij}\pi/2)$.
Hence, testing (1.2) is equivalent to testing

$$H_0 : \Sigma_1 = \Sigma_2 \quad \text{v.s.} \quad H_1 : \Sigma_1 \ne \Sigma_2,$$

under the Gaussian copula model.

Remark 3.8. To test a row or column of the Kendall’s tau matrices, if the conditions of any of
Theorems 3.2, 3.3 and 3.6 are satisfied, we get the same limiting distribution as in (2.19).

4 Experiments 

We demonstrate the numerical performance of the proposed methods on simulated and real datasets.
In particular, we compare the proposed methods with the state-of-the-art method in the literature.

4.1 Numerical Simulations 

We compare the proposed methods with the sample covariance based method (denoted by $T_\alpha^{\mathrm{CLX}}$)
in Cai et al. (2013). Under the null hypothesis, we sample $n_1 + n_2$ data points from the following
4 models:

• Model 1 (Gaussian distribution with blocked $\Sigma^*$). We set $\Sigma^* = (\sigma^*_{ij})$ with $\sigma^*_{ii} = 1$, $\sigma^*_{ij} =
0.6$ for $5(k-1) + 1 \le i \ne j \le 5k$, $k = 1, \ldots, \lfloor d/5 \rfloor$, and $\sigma^*_{ij} = 0$ otherwise. In this model,
under the null hypothesis, we generate $n_1 + n_2$ random vectors from $N(0, \Sigma^*)$.

• Model 2 (Gaussian copula with marginal $t$-distribution). We use the same data generating
mechanism as in Model 1, except that the marginal distributions are transformed into $t(\nu)$
with density $\Gamma((\nu+1)/2)(1 + t^2/\nu)^{-(\nu+1)/2} / (\sqrt{\nu\pi}\,\Gamma(\nu/2))$. We set $\nu = 3$.

• Model 3 (Gaussian copula with marginal Cauchy distribution). We use the same data generating
mechanism as in Model 1, except that the marginal distributions are transformed into
Cauchy($\mu$, $s$) with density $s / (\pi(s^2 + (x - \mu)^2))$. We set $\mu = 0$ and $s = 1$.

• Model 4 (Multivariate $t$-distribution). We generate the data according to $\mu + Z/\sqrt{W/\nu}$,
where we set $W \sim \chi^2(\nu)$ and $Z \sim N(0, \Sigma^*)$ with $W$ and $Z$ independent of each other. We
set $\mu = 0$ and $\nu = 3$.

Under the above models, the two populations of $X$ and $Y$ have the same correlation matrices. We
use them to show that the proposed methods are valid given a fixed size. For the power analysis,
we introduce a random matrix $\Delta = (\delta_{kl}) \in \mathbb{R}^{d \times d}$ with exactly 8 nonzero entries. Among these 8
nonzero entries, 4 entries are randomly selected from the upper triangle of $\Delta$, with a magnitude
generated from Unif(0,1). The remaining 4 entries are selected in the lower triangle and are
determined by symmetry. Under the alternative hypothesis, we use a pair of matrices $\Sigma_1, \Sigma_2$ to
generate samples for $X$ and $Y$ in place of $\Sigma^*$. We set $\Sigma_1 = \Sigma^* + \delta I$ and $\Sigma_2 = \Sigma^* + \Delta + \delta I$ with
$\delta = |\min\{\lambda_{\min}(\Sigma^* + \Delta), \lambda_{\min}(\Sigma^*)\}| + 0.05$.
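The construction of $\Sigma^*$, $\Delta$, $\Sigma_1$ and $\Sigma_2$ can be sketched as follows. This is an illustrative sketch under stated assumptions, not the simulation code behind the tables: the helper names are hypothetical, only the standard library is used, and the smallest eigenvalue is approximated by power iteration on a shifted matrix instead of an exact eigen-decomposition.

```python
import random

def block_sigma_star(d, rho=0.6, block=5):
    """Sigma* of Model 1: unit diagonal, rho inside each complete 5x5 block, 0 elsewhere."""
    sigma = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]
    for k in range(d // block):
        idx = range(block * k, block * (k + 1))
        for i in idx:
            for j in idx:
                if i != j:
                    sigma[i][j] = rho
    return sigma

def sparse_perturbation(d, rng, pairs=4):
    """Symmetric Delta with 2 * pairs nonzero entries; magnitudes drawn from Unif(0, 1)."""
    delta = [[0.0] * d for _ in range(d)]
    upper = [(i, j) for i in range(d) for j in range(i + 1, d)]
    for i, j in rng.sample(upper, pairs):
        delta[i][j] = delta[j][i] = rng.uniform(0.0, 1.0)
    return delta

def add(a, b):
    """Entrywise sum of two matrices stored as lists of rows."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def smallest_eigenvalue(a, iters=500):
    """Approximate lambda_min of a symmetric matrix by power iteration on c*I - a."""
    d = len(a)
    c = max(sum(abs(v) for v in row) for row in a)  # Gershgorin bound: c >= lambda_max(a)

    def shifted(v):
        return [c * v[i] - sum(a[i][j] * v[j] for j in range(d)) for i in range(d)]

    vec = [1.0 + 0.01 * i for i in range(d)]
    top = 0.0
    for _ in range(iters):
        w = shifted(vec)
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            break
        vec = [x / norm for x in w]
        top = sum(x * y for x, y in zip(vec, shifted(vec)))  # Rayleigh quotient
    return c - top

rng = random.Random(1)  # fixed seed, chosen arbitrarily for reproducibility
d = 10
sigma_star = block_sigma_star(d)
delta_mat = sparse_perturbation(d, rng)
shift = abs(min(smallest_eigenvalue(add(sigma_star, delta_mat)),
                smallest_eigenvalue(sigma_star))) + 0.05
shift_identity = [[shift if i == j else 0.0 for j in range(d)] for i in range(d)]
sigma_1 = add(sigma_star, shift_identity)
sigma_2 = add(add(sigma_star, delta_mat), shift_identity)
```

The shift $\delta I$ is added to both matrices so that $\Sigma_2$ stays positive definite after the perturbation while the two matrices still differ only through $\Delta$.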


Table 1: Empirical sizes and powers for Models 1 and 2 with $\alpha = 0.05$, based on 5000 replications.

                           Model 1 (d)                           Model 2 (d)
  n   Test            10    30    50    80   100   150      10    30    50    80   100   150

Empirical size
100   T^{τ,plug}_α  0.05  0.09  0.12  0.15  0.17  0.20    0.05  0.09  0.12  0.16  0.17  0.20
      T^{τ,jack}_α  0.04  0.06  0.06  0.08  0.10  0.10    0.03  0.06  0.07  0.09  0.10  0.11
      T^{τ,ps}_α    0.02  0.03  0.03  0.03  0.04  0.03    0.02  0.03  0.03  0.04  0.04  0.04
      T^{CLX}_α     0.03  0.04  0.04  0.04  0.05  0.04    0.01  0.01  0.01  0.01  0.01  0.01
200   T^{τ,plug}_α  0.04  0.07  0.08  0.08  0.09  0.10    0.04  0.07  0.07  0.09  0.09  0.10
      T^{τ,jack}_α  0.04  0.05  0.06  0.06  0.06  0.07    0.03  0.05  0.06  0.07  0.06  0.07
      T^{τ,ps}_α    0.02  0.04  0.04  0.03  0.04  0.04    0.02  0.03  0.04  0.04  0.04  0.04
      T^{CLX}_α     0.04  0.04  0.04  0.04  0.05  0.05    0.01  0.01  0.01  0.01  0.01  0.01

Empirical power
100   T^{τ,plug}_α  0.93  0.85  0.82  0.78  0.77  0.74    0.94  0.86  0.81  0.78  0.76  0.74
      T^{τ,jack}_α  0.92  0.81  0.77  0.72  0.70  0.66    0.92  0.83  0.76  0.73  0.70  0.65
      T^{τ,ps}_α    0.87  0.73  0.68  0.60  0.57  0.51    0.87  0.75  0.66  0.61  0.57  0.50
      T^{CLX}_α     0.65  0.56  0.52  0.45  0.42  0.35    0.27  0.19  0.13  0.10  0.08  0.05
200   T^{τ,plug}_α  0.99  0.98  0.97  0.96  0.96  0.95    0.99  0.97  0.97  0.95  0.95  0.95
      T^{τ,jack}_α  0.99  0.98  0.96  0.95  0.95  0.94    0.99  0.97  0.96  0.95  0.95  0.94
      T^{τ,ps}_α    0.98  0.97  0.95  0.94  0.94  0.92    0.99  0.97  0.95  0.94  0.94  0.92
      T^{CLX}_α     0.93  0.94  0.92  0.90  0.90  0.87    0.61  0.61  0.53  0.46  0.42  0.36


We set $n_1 = n_2 = n$ with $n = 100, 200$ and $d = 10, 30, 50, 80, 100, 150$. The nominal significance
level is 0.05. Tables 1 and 2 present the empirical performances of these methods. We see that
$T_\alpha^{\tau,\mathrm{ps}}$ always controls the size at the nominal level. When $d$ is significantly larger than $n$, $T_\alpha^{\tau,\mathrm{plug}}$ and $T_\alpha^{\tau,\mathrm{jack}}$ suffer
from size distortion. When $d$ is close to $n$, $T_\alpha^{\tau,\mathrm{jack}}$ is still valid but $T_\alpha^{\tau,\mathrm{plug}}$ fails.

Table 2: Empirical sizes and powers for Models 3 and 4 with $\alpha = 0.05$, based on 5000 replications.

                           Model 3 (d)                           Model 4 (d)
  n   Test            10    30    50    80   100   150      10    30    50    80   100   150

Empirical size
100   T^{τ,plug}_α  0.05  0.09  0.12  0.15  0.15  0.21    0.06  0.11  0.12  0.15  0.18  0.21
      T^{τ,jack}_α  0.04  0.05  0.07  0.09  0.08  0.11    0.04  0.07  0.08  0.08  0.11  0.12
      T^{τ,ps}_α    0.02  0.03  0.03  0.04  0.03  0.04    0.02  0.03  0.03  0.04  0.04  0.04
      T^{CLX}_α     0.00  0.00  0.01  0.01  0.01  0.02    0.01  0.00  0.00  0.00  0.00  0.00
200   T^{τ,plug}_α  0.04  0.06  0.07  0.09  0.09  0.09    0.04  0.07  0.07  0.09  0.09  0.10
      T^{τ,jack}_α  0.03  0.05  0.06  0.06  0.07  0.06    0.03  0.05  0.06  0.07  0.07  0.07
      T^{τ,ps}_α    0.02  0.03  0.04  0.04  0.04  0.04    0.02  0.04  0.04  0.04  0.04  0.04
      T^{CLX}_α     0.00  0.00  0.00  0.01  0.01  0.02    0.01  0.00  0.00  0.00  0.00  0.00

Empirical power
100   T^{τ,plug}_α  0.94  0.86  0.81  0.78  0.76  0.74    0.87  0.75  0.68  0.65  0.63  0.61
      T^{τ,jack}_α  0.93  0.82  0.76  0.71  0.69  0.66    0.85  0.70  0.63  0.58  0.55  0.51
      T^{τ,ps}_α    0.88  0.75  0.67  0.59  0.56  0.49    0.76  0.59  0.49  0.41  0.37  0.32
      T^{CLX}_α     0.00  0.00  0.00  0.01  0.02  0.02    0.09  0.03  0.03  0.01  0.01  0.01
200   T^{τ,plug}_α  0.99  0.98  0.96  0.96  0.96  0.94    0.98  0.95  0.93  0.91  0.91  0.89
      T^{τ,jack}_α  0.99  0.98  0.96  0.95  0.95  0.93    0.98  0.94  0.92  0.90  0.90  0.87
      T^{τ,ps}_α    0.98  0.97  0.95  0.94  0.94  0.92    0.97  0.94  0.90  0.87  0.86  0.83
      T^{CLX}_α     0.01  0.00  0.01  0.01  0.01  0.01    0.22  0.13  0.10  0.07  0.06  0.04

Examining the empirical powers in Tables 1 and 2, the three proposed methods outperform
$T_\alpha^{\mathrm{CLX}}$. In particular, for distributions with heavy tails or strong tail dependence, the power of $T_\alpha^{\mathrm{CLX}}$
decreases dramatically, making $T_\alpha^{\mathrm{CLX}}$ inappropriate for such applications. Under the Gaussian
distribution, $T_\alpha^{\tau,\mathrm{ps}}$ has greater power than $T_\alpha^{\mathrm{CLX}}$ since it does not need to estimate the variance.

These finite sample results suggest that $T_\alpha^{\tau,\mathrm{plug}}$ is useful only when $d$ is significantly smaller than
$n$. When $d$ is close to $n$, we recommend $T_\alpha^{\tau,\mathrm{jack}}$ because it has higher power on average.
When $d$ is significantly larger than $n$, we recommend $T_\alpha^{\tau,\mathrm{ps}}$.

4.2 Real Data Example 

In this section we analyze the dependence structure of the residual vectors, $\varepsilon_t := (\varepsilon_{1t}, \varepsilon_{2t}, \ldots, \varepsilon_{dt})^T \in
\mathbb{R}^d$, of the Fama-French three-factor model (Fama and French, 1993). The Fama-French three-factor
model is

$$r_{it} - r_f = \beta_i (r_{mt} - r_f) + b_{si}\,\mathrm{SMB}_t + b_{vi}\,\mathrm{HML}_t + \varepsilon_{it},$$

where $r_{it}$ is the random return of the $i$th portfolio at time $t$, $r_f$ is the risk free rate, $r_{mt}$ is the
return of the market portfolio, $\mathrm{SMB}_t$ is the portfolio return related to the size factor, and $\mathrm{HML}_t$ is
the portfolio return related to the book-to-market factor.
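The residuals of the factor regression can be obtained by ordinary least squares. The sketch below is a minimal illustration under stated assumptions: the function names are hypothetical, the model is fitted without an intercept to mirror the displayed equation, and the normal equations are solved by Gaussian elimination using only the standard library.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting (a must be nonsingular)."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x

def factor_residuals(excess_returns, factors):
    """Least-squares fit of one portfolio's excess returns on the factor rows.

    excess_returns: list of r_it - r_f over time t.
    factors: list of rows [r_mt - r_f, SMB_t, HML_t], one row per time t.
    Returns (coefficients, residuals).
    """
    p = len(factors[0])
    xtx = [[sum(row[a] * row[b] for row in factors) for b in range(p)] for a in range(p)]
    xty = [sum(row[a] * y for row, y in zip(factors, excess_returns)) for a in range(p)]
    coef = solve_linear_system(xtx, xty)
    resid = [y - sum(c * f for c, f in zip(coef, row))
             for row, y in zip(factors, excess_returns)]
    return coef, resid
```

Running `factor_residuals` once per portfolio and stacking the residuals over portfolios gives the vectors $\varepsilon_t$ on which the tests are applied.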

We use 30 industry value-weighted portfolios from Kenneth R. French’s Data Library.$^6$ According
to financial theory, $\varepsilon_{it}$ represents the individual risk owned by the $i$th industry. Heuristically
speaking, most correlations between different industries should be small after adjusting for common
market factors, as $\varepsilon_{it}$ represents each industry’s own risk. We emphasize that our methods allow the
existence of strong correlations between some industries after adjusting for factors, as long as the
number of such industries is small. Therefore, although it is not reasonable to assume the sparsity
of the Pearson’s or Kendall’s tau correlation matrix of $r_t = (r_{1t}, \ldots, r_{dt})^T$, we can accept their sparsity
after adjusting for the effects of common factors.

$^6$ The website for the data library is http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html.

Figure 1: The figure illustrates the p-value sequences for testing the equality of the residual vectors’ dependence
structure from 1991 to 2012. We conduct the tests every half year to compare the covariance or correlation
matrices during a one-year period. The tests are based on returns for every two weeks from the stock market
and the significance level $\alpha$ is 0.05.

We use the regression described in Fama and French (1993) to estimate $\beta_i$, $b_{si}$, $b_{vi}$ and test
on the residuals $\varepsilon_{it}$. We use returns of every two weeks to compare the dependence structure of
residual vectors for a one-year period and test about every half year from 1991 to 2012. Since
financial data are heavy tailed, it is natural to test the equality of Kendall’s tau rank correlation
matrices, as conducted by $T_\alpha^{\tau,\mathrm{jack}}$ and $T_\alpha^{\tau,\mathrm{ps}}$. Since the number of portfolios $d$ is close to the number
of samples $n$, we apply $T_\alpha^{\tau,\mathrm{jack}}$ and $T_\alpha^{\tau,\mathrm{ps}}$ and compare them with $T_\alpha^{\mathrm{CLX}}$. We show the results in Figure 1.

Figure 1 illustrates that $T_\alpha^{\mathrm{CLX}}$ sometimes diverges from $T_\alpha^{\tau,\mathrm{jack}}$ and $T_\alpha^{\tau,\mathrm{ps}}$. For example, in
November 2009, the p-value of $T_\alpha^{\mathrm{CLX}}$ is 0.0002, while the p-values of $T_\alpha^{\tau,\mathrm{jack}}$ and $T_\alpha^{\tau,\mathrm{ps}}$ are 0.2604 and
0.2904. This indicates that, at this time point, the fluctuation of the variance rather than of the
correlation causes the rejection by $T_\alpha^{\mathrm{CLX}}$. From Figure 1 we find that $T_\alpha^{\tau,\mathrm{ps}}$ is more conservative than
$T_\alpha^{\tau,\mathrm{jack}}$, and we also observe that some fluctuations of the dependence structure coincide with
changes in the economic or market environment. For example, from 2007 to 2008, during
the global financial crisis, $T_\alpha^{\tau,\mathrm{jack}}$ takes the p-values 0.0342, 0.0156, 0.2657, 0.0113. Tests
around the year 2000 reveal a similar pattern, which may be related to the “dot-com bubble”.
However, not all known financial crises lead to fluctuations of the dependence
structure. This illustrates that financial crises may influence the equity market differently.

5 Summary and Discussion


This paper considers the problem of testing the equality of high-dimensional U-statistic based
matrices. We provide a lower bound for testing the equality of correlation matrices and prove the
proposed methods’ optimality. Based on thorough numerical comparisons, $T_\alpha^{\tau,\mathrm{plug}}$ performs well only
when $d$ is significantly smaller than $n$. When $d$ is close to $n$, we recommend $T_\alpha^{\tau,\mathrm{jack}}$ since it has
greater power. When $d$ is much larger than $n$, we recommend $T_\alpha^{\tau,\mathrm{ps}}$. In addition, $T_\alpha^{\tau,\mathrm{ps}}$ performs
quite well for distributions with heavy tails or strong tail dependence. Therefore, $T_\alpha^{\tau,\mathrm{ps}}$ is potentially
more useful for financial applications, in which heavy-tailedness is a common phenomenon. There
are many possible future directions of this work. For example, instead of two-sample problems,
it is interesting to generalize the idea to $k$-sample testing problems ($k > 2$). This may require a
nontrivial extension of the theoretical analysis.

For testing Kendall’s tau matrices, we show that the variance estimation error is a key factor
influencing a test procedure’s power. In fact, the test $T_\alpha^{\tau,\mathrm{ps}}$, which exploits the exact value of the variance
under the uncorrelated condition ($\tau = 0$), achieves a better finite-sample performance, especially
when $d$ is much larger than $n$. This idea can be generalized to many other applications. We also
provide an upper bound on the Jackknife variance estimation error in the proof of Theorem 2.2.
This result is also useful for proving other properties of U-statistics.

Next, we discuss the imposed assumptions. We note that the sparsity assumption (A1) plays
a key role in obtaining the limiting extreme value distribution. It is not clear whether this
assumption is necessary, but it is satisfied in many high-dimensional applications. When (A1) is
not satisfied, it is possible to exploit the bootstrap method to construct a test statistic. This is
left for future investigation. Regarding (A2), we note that Cai et al. (2013) assume a stronger
scaling assumption: $\log d = o(n^{1/5})$. We improve this scaling by assuming $\log d = O(n^{1/3-\epsilon})$
for an arbitrary $\epsilon > 0$. This is possible because the U-statistics studied in this paper are assumed
to have bounded kernels.

In the simulation studies, we use $T_\alpha^{\mathrm{CLX}}$ as a comparison benchmark. In Appendix B, we provide
another heuristic test (denoted by $T_\alpha^{R}$) for testing the equality of Pearson’s correlation matrices.
The performances of $T_\alpha^{\mathrm{CLX}}$ and $T_\alpha^{R}$ are similar.

A Introduction to the Meta-Elliptical Distribution

This section introduces the meta-elliptical distribution, which is essentially the elliptical copula
family. It does not require the latent correlation matrix to be positive definite.

Definition A.1. Let $\mu \in \mathbb{R}^d$ and $\Sigma \in \mathbb{R}^{d \times d}$ with $\mathrm{Rank}(\Sigma) = q \le d$. A $d$-dimensional random
vector $X$ has an elliptical distribution, denoted by $X \sim EC_d(\mu, \Sigma, \xi)$, if it has the stochastic
representation $X = \mu + \xi A U$, where $U$ is a random vector uniformly distributed on the unit sphere in
$\mathbb{R}^q$ and $\xi \ge 0$ is a scalar random variable independent of $U$. Here $A \in \mathbb{R}^{d \times q}$ is a deterministic matrix
such that $AA^T = \Sigma$.

To define the meta-elliptical distribution, we define a set of symmetric matrices:

$$\mathcal{R}_d = \{\Sigma' \in \mathbb{R}^{d \times d} : (\Sigma')^T = \Sigma',\ \mathrm{Diag}(\Sigma') = I_d,\ \Sigma' \succeq 0\}.$$

Definition A.2. A continuous random vector $X = (X_1, \ldots, X_d)^T \in \mathbb{R}^d$ follows a meta-elliptical
distribution, denoted by $X \sim ME_d(\Sigma', \xi; f_1, \ldots, f_d)$, if there exist univariate strictly increasing
functions $f_1, \ldots, f_d$ such that

$$(f_1(X_1), \ldots, f_d(X_d))^T \sim EC_d(0, \Sigma', \xi),$$

where $\Sigma' = (\sigma'_{ij}) \in \mathcal{R}_d$. We call $\Sigma'$ the latent generalized correlation matrix.

The following theorem illustrates an important relationship between the population Kendall’s
tau coefficient matrix $U^\tau$ and the latent generalized correlation matrix $\Sigma'$.

Theorem A.3. (Invariance property of Kendall’s tau (Han and Liu, 2012)). If we let $X =
(X_1, \ldots, X_d)^T \sim ME_d(\Sigma', \xi; f_1, \ldots, f_d)$, and denote $\tau_{ij}$ to be the population Kendall’s tau between
$X_i$ and $X_j$, we have $\sigma'_{ij} = \sin(\tau_{ij}\pi/2)$.

Theorem A.3 shows that if $X$ and $Y$ follow the meta-elliptical distribution, testing (1.2) is
equivalent to testing

$$H_0 : \Sigma'_1 = \Sigma'_2 \quad \text{v.s.} \quad H_1 : \Sigma'_1 \ne \Sigma'_2.$$

If $X$ and $Y$ follow multivariate Gaussian distributions, $\Sigma'_1$ and $\Sigma'_2$ are the Pearson’s correlation
coefficient matrices of $X$ and $Y$.
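The relation $\sigma'_{ij} = \sin(\tau_{ij}\pi/2)$ and its invariance to monotone marginal transformations can be checked numerically. The sketch below is only an illustration with hypothetical helper names: it simulates a bivariate Gaussian pair with an arbitrarily chosen correlation 0.6 and seed, applies strictly increasing transformations to both margins, and compares the sine transform of the sample Kendall's tau with the latent correlation.

```python
import math
import random

def kendall_tau(x, y):
    """Kendall's tau of two equal-length sequences without ties (O(n^2))."""
    n = len(x)
    s = 0
    for a in range(n):
        for b in range(a + 1, n):
            prod = (x[a] - x[b]) * (y[a] - y[b])
            s += (prod > 0) - (prod < 0)
    return 2.0 * s / (n * (n - 1))

def gaussian_pair_sample(rho, n, seed=0):
    """Draw n points from a bivariate standard normal with correlation rho."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        xs.append(z1)
        ys.append(rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
    return xs, ys

rho = 0.6
xs, ys = gaussian_pair_sample(rho, n=800)

# Strictly increasing marginal transformations (a Gaussian copula with non-Gaussian margins).
xs_t = [math.exp(v) for v in xs]
ys_t = [v ** 3 for v in ys]

tau_latent = kendall_tau(xs, ys)
tau_observed = kendall_tau(xs_t, ys_t)
sigma_hat = math.sin(math.pi * tau_observed / 2.0)
```

Since Kendall's tau depends only on the ordering of the observations, `tau_observed` coincides with `tau_latent`, and `sigma_hat` should be close to `rho` up to sampling error.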

B More Simulation Results 

This section provides another heuristic test (denoted by $T_\alpha^{R}$) for testing the equality of Pearson’s
correlation matrices. The performances of $T_\alpha^{\mathrm{CLX}}$ and $T_\alpha^{R}$ are similar, and both are outperformed by the
proposed rank-based methods.

We first explain the construction of $T_\alpha^{R}$. We set $\bar X := (\bar X_1, \ldots, \bar X_d)^T$ and $\bar Y := (\bar Y_1, \ldots, \bar Y_d)^T$,
where $\bar X_j = (X_{1j} + X_{2j} + \ldots + X_{n_1 j})/n_1$ and $\bar Y_j = (Y_{1j} + Y_{2j} + \ldots + Y_{n_2 j})/n_2$. We use $s_{1j}$ and $s_{2j}$
to denote the sample standard deviations of $X_j$ and $Y_j$ for $1 \le j \le d$. We then construct $\tilde X_i :=
(\tilde X_{i1}, \ldots, \tilde X_{id})^T$ and $\tilde Y_i := (\tilde Y_{i1}, \ldots, \tilde Y_{id})^T$, where $\tilde X_{ij} = (X_{ij} - \bar X_j)/s_{1j}$ and $\tilde Y_{ij} = (Y_{ij} - \bar Y_j)/s_{2j}$. $T_\alpha^{R}$
uses the same testing procedure as $T_\alpha^{\mathrm{CLX}}$ in Cai et al. (2013) except that it is built on the rescaled
samples $\{\tilde X_i\}_{1 \le i \le n_1}$ and $\{\tilde Y_i\}_{1 \le i \le n_2}$ rather than $\{X_i\}_{1 \le i \le n_1}$ and $\{Y_i\}_{1 \le i \le n_2}$. Table 3 shows that
the performances of $T_\alpha^{R}$ and $T_\alpha^{\mathrm{CLX}}$ are similar. We note that ideas similar to $T_\alpha^{R}$ have also been
considered in Cai and Zhang (2014) for correlation estimation.

Table 3: This table compares the empirical sizes and powers of $T_\alpha^{R}$ with those of $T_\alpha^{\mathrm{CLX}}$ and the three proposed
tests under the Gaussian distribution in Model 1. The result shows that the performances of $T_\alpha^{\mathrm{CLX}}$ and $T_\alpha^{R}$
are similar.

                         Empirical size (d)                    Empirical power (d)
  n   Test            10    30    50    80   100   150      10    30    50    80   100   150

100   T^{τ,plug}_α  0.05  0.09  0.12  0.15  0.17  0.20    0.93  0.85  0.82  0.78  0.77  0.74
      T^{τ,jack}_α  0.04  0.06  0.06  0.08  0.10  0.10    0.92  0.81  0.77  0.72  0.70  0.66
      T^{τ,ps}_α    0.02  0.03  0.03  0.03  0.04  0.03    0.87  0.73  0.68  0.60  0.57  0.51
      T^{CLX}_α     0.03  0.04  0.04  0.04  0.05  0.04    0.65  0.56  0.52  0.45  0.42  0.35
      T^{R}_α       0.02  0.03  0.04  0.04  0.05  0.04    0.61  0.53  0.48  0.42  0.39  0.33
200   T^{τ,plug}_α  0.04  0.07  0.08  0.08  0.09  0.10    0.99  0.98  0.97  0.96  0.96  0.95
      T^{τ,jack}_α  0.04  0.05  0.06  0.06  0.06  0.07    0.99  0.98  0.96  0.95  0.95  0.94
      T^{τ,ps}_α    0.02  0.04  0.04  0.03  0.04  0.04    0.98  0.97  0.95  0.94  0.94  0.92
      T^{CLX}_α     0.04  0.04  0.04  0.04  0.05  0.05    0.93  0.94  0.92  0.90  0.90  0.87
      T^{R}_α       0.02  0.04  0.04  0.04  0.04  0.05    0.95  0.94  0.92  0.91  0.90  0.88


C Proofs of Main Theoretical Results 

This section collects proofs of the main theorems. In the sequel, we use $C, C_1, C_2, \ldots$ to denote
constants that do not depend on $n$, $d$, $q$ and can vary from place to place.


C.1 Proof of Theorem 2.2


Proof. As explained in the proof sketch, our analysis proceeds in the three steps described below.

Step (i). In this step, we prove that it is sufficient to establish (2.14) for proving the theorem.
To do so, we need to sharply characterize the estimation error of the Jackknife variance estimator
of U-statistics, which is the content of the following lemma.

Lemma C.1. Let $\hat\sigma^2(\hat u_{a,ij})$ be the Jackknife variance estimator of $\hat u_{a,ij}$ and $\sigma^2(\hat u_{a,ij})$ be the variance of
$\hat u_{a,ij}$. Recalling the definition of $h_{ij}$ in (2.9), $\zeta_{1,ij}$ and $\zeta_{2,ij}$ are the variances of $h_{ij}(X_\alpha)$ and $h_{ij}(Y_\alpha)$.
We have that $m^2\zeta_{a,ij}$ is the limit of $n_a\sigma^2(\hat u_{a,ij})$ as $n_a$ goes to infinity. We also have that $m^2\zeta_{a,ij}$ is
the limit of $n_a\hat\sigma^2(\hat u_{a,ij})$ as $n_a$ goes to infinity. Moreover, under Assumption (A2), as $n, q \to \infty$ we
have

$$P\Big(\max_{1 \le i,j \le q} \big|n_a\hat\sigma^2(\hat u_{a,ij}) - m^2\zeta_{a,ij}\big| > C\frac{\varepsilon_n}{\log q}\Big) = o(1), \tag{C.1}$$

where $\varepsilon_n = o(1)$ and $a = 1, 2$.
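To make the object in Lemma C.1 concrete, the sketch below computes a Jackknife variance estimate for Kendall's tau, a U-statistic of order $m = 2$. It uses the standard delete-one jackknife formula, which is an assumption here: the exact form of the Jackknife estimator used in the paper is defined in an earlier section and may differ. The function names are hypothetical.

```python
def kendall_tau(x, y):
    """Kendall's tau of two equal-length sequences without ties (O(n^2))."""
    n = len(x)
    s = 0
    for a in range(n):
        for b in range(a + 1, n):
            prod = (x[a] - x[b]) * (y[a] - y[b])
            s += (prod > 0) - (prod < 0)
    return 2.0 * s / (n * (n - 1))

def jackknife_variance_tau(x, y):
    """Delete-one jackknife estimate of Var(tau_hat); x and y are lists of length n >= 3."""
    n = len(x)
    # Leave-one-out estimates: recompute tau with observation i removed.
    loo = [kendall_tau(x[:i] + x[i + 1:], y[:i] + y[i + 1:]) for i in range(n)]
    mean_loo = sum(loo) / n
    return (n - 1) / n * sum((t - mean_loo) ** 2 for t in loo)
```

Multiplying the returned value by $n$ gives the quantity that is compared with $m^2\zeta_{a,ij}$ in (C.1).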


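The detailed proof of Lemma C.1 is in the supplementary Appendix D.1. Lemma C.1 implies
that both of the following two events

$$\mathcal{E}_1 := \Big\{\max_{1 \le i,j \le q} \big|n_1\hat\sigma^2(\hat u_{1,ij}) - m^2\zeta_{1,ij}\big| \le C\frac{\varepsilon_n}{\log q}\Big\}, \qquad \mathcal{E}_2 := \Big\{\max_{1 \le i,j \le q} \big|n_2\hat\sigma^2(\hat u_{2,ij}) - m^2\zeta_{2,ij}\big| \le C\frac{\varepsilon_n}{\log q}\Big\},$$

happen with probability going to one as $n, q \to \infty$. Under $\mathcal{E}_1$ and $\mathcal{E}_2$, by $\zeta_{a,ij} \ge r_a > 0$ (Assumption
(A2)), we have

$$\Big|\frac{n_1\hat\sigma^2(\hat u_{1,ij})}{m^2\zeta_{1,ij}} - 1\Big| \le \frac{C\varepsilon_n}{\log q} \quad \text{and} \quad \Big|\frac{n_2\hat\sigma^2(\hat u_{2,ij})}{m^2\zeta_{2,ij}} - 1\Big| \le \frac{C\varepsilon_n}{\log q}.$$

We set $M_n := \max_{1 \le i,j \le q} M_{ij}$ and $\tilde M_n := \max_{1 \le i,j \le q} \tilde M_{ij}$. By the definition of $M_n$ and $\tilde M_n$, we
calculate the relative difference of $M_{ij}$ and $\tilde M_{ij}$ as

$$\Big|\frac{M_{ij} - \tilde M_{ij}}{\tilde M_{ij}}\Big| = \Big|\frac{\hat\sigma^2(\hat u_{1,ij}) - m^2\zeta_{1,ij}/n_1 + \hat\sigma^2(\hat u_{2,ij}) - m^2\zeta_{2,ij}/n_2}{\hat\sigma^2(\hat u_{1,ij}) + \hat\sigma^2(\hat u_{2,ij})}\Big| \le \frac{C\varepsilon_n}{\log q}. \tag{C.2}$$

Therefore, we have $|M_{ij} - \tilde M_{ij}| \le C\varepsilon_n \tilde M_{ij}/\log q$, which implies that

$$|M_n - \tilde M_n| \le \max_{1 \le i,j \le q} |M_{ij} - \tilde M_{ij}| \le C\tilde M_n\varepsilon_n/\log q. \tag{C.3}$$

Combining $\tilde M_n/\log q = O_P(1)$ and $\varepsilon_n = o(1)$, to prove Theorem 2.2 it suffices to show that as
$n, q \to \infty$, (2.14) holds for any $x \in \mathbb{R}$.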

Step (ii). In this step, we use the Hoeffding decomposition (Lemma F.4) to decompose the
U-statistics. We then prove that the residual term $\Delta_{n_a,ij}/\binom{n_a}{m}$ is negligible, i.e., to prove the theorem it is
sufficient to prove (2.17) as $n, q \to \infty$.

For notational simplicity, we set

$$N_{ij} := (\hat u_{1,ij} - \hat u_{2,ij}) \Big/ \sqrt{m^2\zeta_{1,ij}/n_1 + m^2\zeta_{2,ij}/n_2}.$$

Recall that in (2.13) and (2.16) we define $\tilde M_{ij}$ and $T_{ij}$ as

$$\tilde M_{ij} := \frac{(\hat u_{1,ij} - \hat u_{2,ij})^2}{m^2\zeta_{1,ij}/n_1 + m^2\zeta_{2,ij}/n_2} \tag{C.4}$$

and

$$T_{ij} := \frac{\sum_{\alpha=1}^{n_1} h_{ij}(X_\alpha)/n_1 - \sum_{\alpha=1}^{n_2} h_{ij}(Y_\alpha)/n_2}{\sqrt{\zeta_{1,ij}/n_1 + \zeta_{2,ij}/n_2}}. \tag{C.5}$$

By the definition of $N_{ij}$, we have $\tilde M_{ij} = (N_{ij})^2$. Combining the definition of $T_{ij}$ and (2.15), we have

$$N_{ij} = T_{ij} + \frac{\binom{n_1}{m}^{-1}\Delta_{n_1,ij} - \binom{n_2}{m}^{-1}\Delta_{n_2,ij}}{\sqrt{m^2\zeta_{1,ij}/n_1 + m^2\zeta_{2,ij}/n_2}}. \tag{C.6}$$

Then, we introduce the following lemma to analyze the difference of $N_{ij}$ and $T_{ij}$.

Lemma C.2. As $n, q \to \infty$, we have

$$\Big|\max_{1 \le i,j \le q} (N_{ij})^2 - \max_{1 \le i,j \le q} (T_{ij})^2\Big| = o_P(1). \tag{C.7}$$

The detailed proof of Lemma C.2 is in the supplementary Appendix D.2. This lemma illustrates
that $\max_{1 \le i,j \le q} (N_{ij})^2$ and $T_n := \max_{1 \le i,j \le q} (T_{ij})^2$ have the same limiting distribution. Hence, to prove
Theorem 2.2 it suffices to show (2.17) as $n, q \to \infty$.

Step (iii). In this step, we aim to prove (2.17). In (2.17), $T_n$ is the maximum of $(T_{ij})^2$ over
$S := \{(i,j) : 1 \le i, j \le q\}$ and these $T_{ij}$’s are not independent of each other. Therefore, we cannot
straightforwardly exploit the extreme value theorem under the independent setting to obtain the
limiting distribution of $T_n$. To solve this problem, we construct a normal approximation to obtain
the extreme value distribution of $(T_{ij})_{1 \le i,j \le q}$ under the setting that the $T_{ij}$’s can be dependent on each
other. The construction of such a normal approximation requires most correlations of different $T_{ij}$’s
to be small. Correlations between different $T_{ij}$’s are related to the correlations of entries of $X$ and
$Y$. Assumption (A1) specifies sufficient conditions on the correlations of entries of $X$ and $Y$.

To obtain more insight into Assumption (A1), we introduce the following notation. We use $S_0$
to denote the pairs $(i,j)$ such that $X_i$ and $X_j$ are highly correlated ($|u_{1,ij}| > (\log q)^{-1-\alpha_0}$) or $Y_i$ and
$Y_j$ are highly correlated ($|u_{2,ij}| > (\log q)^{-1-\alpha_0}$). In detail, we define

$$S_0 := \{(i,j) : 1 \le i, j \le q,\ i \in \mathrm{supp}_j(\alpha_0)\}, \tag{C.8}$$

where we set $\mathrm{supp}_j(\alpha_0)$ as

$$\mathrm{supp}_j(\alpha_0) := \{1 \le i \le q : |u_{1,ij}| > (\log q)^{-1-\alpha_0} \text{ or } |u_{2,ij}| > (\log q)^{-1-\alpha_0}\}.$$

Assumption (A1) implies that the number of highly correlated ($|u_{a,ij}| > (\log q)^{-1-\alpha_0}$) entries
of $X$ and $Y$ is small. More specifically, Assumption (A1) assumes $|S_0| = o(q^2)$.

We can prove that correlations between $T_{ij}$’s on $S \setminus S_0$ are all small. We then use the Bonferroni
inequality (Lemma 1 of Cai et al. (2013)) and a normal approximation to obtain the limiting
distribution of $\max_{(i,j) \in S \setminus S_0} (T_{ij})^2$ so as to prove (2.17).

We now present the detailed proof of (2.17). First, we prove that it suffices to take the
maximum of $(T_{ij})^2$ over $S \setminus S_0$ instead of over $S$ as in (2.17). By setting $y_q = x + 4\log q - \log(\log q)$,
we have

$$\Big|P\Big(\max_{(i,j) \in S} (T_{ij})^2 \ge y_q\Big) - P\Big(\max_{(i,j) \in S \setminus S_0} (T_{ij})^2 \ge y_q\Big)\Big| \le P\Big(\max_{(i,j) \in S_0} (T_{ij})^2 \ge y_q\Big). \tag{C.9}$$

The next lemma implies that, as $n, q \to \infty$, we have $P(\max_{(i,j) \in S_0} (T_{ij})^2 \ge y_q) \to 0$.

Lemma C.3. Under Assumptions (A1) and (A2), as $n, q \to \infty$, we have

$$P\Big(\max_{(i,j) \in S_0} (T_{ij})^2 \ge y_q\Big) \to 0.$$



The detailed proof of Lemma C.3 is in the supplementary Appendix D.3. By Lemma C.3, we
have $P(\max_{(i,j) \in S_0} (T_{ij})^2 \ge y_q) \to 0$ as $n, q \to \infty$. Moreover, by (C.9), $P(\max_{(i,j) \in S} (T_{ij})^2 \ge
y_q)$ and $P(\max_{(i,j) \in S \setminus S_0} (T_{ij})^2 \ge y_q)$ have the same limit as $n, q \to \infty$. Therefore, to obtain
(2.17), it suffices to prove

$$P\Big(\max_{(i,j) \in S \setminus S_0} (T_{ij})^2 - 4\log q + \log(\log q) \le x\Big) \to \exp\Big(-\frac{1}{\sqrt{8\pi}}\exp\Big(-\frac{x}{2}\Big)\Big), \tag{C.10}$$

as $n, q \to \infty$. The problem is then reduced to proving (C.10).

For simplicity, by rearranging the two-dimensional indices $\{(i,j) : (i,j) \in S \setminus S_0\}$ in any order,
we write them as $\{(i_k, j_k) : 1 \le k \le h\}$ with $h = |S \setminus S_0|$. If we denote $T_k := T_{i_k j_k}$, (C.10) becomes

$$P\Big(\max_{1 \le k \le h} (T_k)^2 - 4\log q + \log(\log q) \le x\Big) \to \exp\Big(-\frac{1}{\sqrt{8\pi}}\exp\Big(-\frac{x}{2}\Big)\Big). \tag{C.11}$$

Second, we exploit a normal approximation to obtain the limiting distribution of $\max_{1 \le k \le h} (T_k)^2$.
This normal approximation is useful for getting the extreme value distribution of weakly dependent
data. By excluding all the pairs in $S_0$, correlations between the $T_k$’s are all small. Therefore, we can
use this normal approximation to get the limiting distribution of $\max_{1 \le k \le h} (T_k)^2$. In detail, we
first use the Bonferroni inequality to obtain both lower and upper bounds of $P(\max_{1 \le k \le h} (T_k)^2 \ge
y_q)$. The obtained lower and upper bounds can then be shown to have the same limit,
which yields the extreme value distribution with the cumulative distribution function
$\exp(-(8\pi)^{-1/2}\exp(-x/2))$.
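The two-sided Bonferroni bounds used in this argument can be illustrated numerically. The sketch below considers the simplest case of $h$ independent events with common probability $p$, for which the inner sums over index subsets reduce to binomial coefficients times powers of $p$. This toy setting and the function names are assumptions made only for illustration; the events in the proof are dependent.

```python
from math import comb

def bonferroni_partial_sum(h, p, terms):
    """Sum over l = 1..terms of (-1)^(l-1) * C(h, l) * p^l.

    For h independent events of probability p, C(h, l) * p^l is the sum of the
    probabilities of all l-fold intersections.
    """
    return sum((-1) ** (l - 1) * comb(h, l) * p ** l for l in range(1, terms + 1))

def union_probability(h, p):
    """Exact probability that at least one of h independent events occurs."""
    return 1.0 - (1.0 - p) ** h

h, p = 20, 0.05
exact = union_probability(h, p)
bounds = [(bonferroni_partial_sum(h, p, 2 * M), bonferroni_partial_sum(h, p, 2 * M - 1))
          for M in range(1, 6)]
```

Each pair in `bounds` brackets `exact`, with an even number of terms giving the lower bound and an odd number giving the upper bound, and the bracket tightens as $M$ grows.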

To describe the procedure of normal approximation, we need some additional notation. We introduce

\[ Z_{\beta,ij} = n_2 h_{ij}(X_\beta)/n_1 \ \text{ for } 1 \le \beta \le n_1, \qquad Z_{\beta,ij} = -h_{ij}(Y_{\beta-n_1}) \ \text{ for } n_1 + 1 \le \beta \le n_1 + n_2, \tag{C.12} \]

where $h_{ij}$ is defined in (2.9). Moreover, by the definition of $T_{ij}$ in (2.16), we have

\[ T_k := T_{i_k j_k} = \sum_{\beta=1}^{n_1+n_2} Z_{\beta,i_k j_k} \Big/ \sqrt{n_2^2\zeta_{1,i_k j_k}/n_1 + n_2\zeta_{2,i_k j_k}}. \tag{C.13} \]


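After introducing this notation, we explain how to use the normal approximation to get the extreme value distribution of $\max_{1\le k\le h}(T_k)^2$. Firstly, by the Bonferroni inequality (Lemma 1 of Cai et al. (2013)), for any integer $M$ with $0 < M < [h/2]$, we have

\[ \sum_{\ell=1}^{2M}(-1)^{\ell-1}\sum_{1\le k_1<\cdots<k_\ell\le h} P\Big(\bigcap_{j=1}^{\ell}E_{k_j}\Big) \le P\Big(\max_{1\le k\le h}(T_k)^2 > y_q\Big) \le \sum_{\ell=1}^{2M-1}(-1)^{\ell-1}\sum_{1\le k_1<\cdots<k_\ell\le h} P\Big(\bigcap_{j=1}^{\ell}E_{k_j}\Big), \tag{C.14} \]

where we set $E_{k_j} = \{(T_{k_j})^2 > y_q\}$. In the next step, we simplify $P\big(\bigcap_{j=1}^{\ell}E_{k_j}\big)$. For this, we define

\[ Z_{\beta k} = Z_{\beta,i_k j_k}\big/\big(n_2\zeta_{1,i_k j_k}/n_1 + \zeta_{2,i_k j_k}\big)^{1/2} \quad\text{and}\quad W_\beta = (Z_{\beta k_1},\ldots,Z_{\beta k_\ell})^T, \tag{C.15} \]

for $1 \le k \le h$ and $1 \le \beta \le n_1 + n_2$. Therefore, we have

\[ T_{k_j} = (n_2)^{-1/2}\sum_{\beta=1}^{n_1+n_2} Z_{\beta k_j}. \]

Define $\|v\|_{\min} = \min_{1\le i\le \ell}|v_i|$ for any vector $v \in \mathbb{R}^{\ell}$. With this notation, we rewrite $P\big(\bigcap_{j=1}^{\ell}E_{k_j}\big)$ as

\[ P\Big(\bigcap_{j=1}^{\ell}E_{k_j}\Big) = P\Big(\Big\| n_2^{-1/2}\sum_{\beta=1}^{n_1+n_2} W_\beta \Big\|_{\min} > y_q^{1/2}\Big). \]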


Secondly, we use a normal vector $N_\ell$ to approximate $n_2^{-1/2}\sum_{\beta=1}^{n_1+n_2} W_\beta$. In detail, we set $N_\ell$ as a normal vector with the same mean vector and the same covariance matrix as $n_2^{-1/2}\sum_{\beta=1}^{n_1+n_2} W_\beta$. More specifically, we have

\[ N_\ell := (N_{k_1},\ldots,N_{k_\ell})^T \ \text{ with } \ E[N_\ell] = 0 \ \text{ and } \ \mathrm{Var}(N_\ell) = n_1\mathrm{Var}(W_1)/n_2 + \mathrm{Var}(W_{n_1+1}). \tag{C.16} \]

The following lemma uses $N_\ell$ to rewrite the upper and lower bounds in (C.14).
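Lemma C.4. Under Assumption (A2), as $n, q \to \infty$, we have

\[ P\Big(\max_{1\le k\le h}(T_k)^2 > y_q\Big) \le \sum_{\ell=1}^{2M-1}(-1)^{\ell-1}\sum_{1\le k_1<\cdots<k_\ell\le h} P\Big(\|N_\ell\|_{\min} > y_q^{1/2} - \varepsilon_n(\log q)^{-1/2}\Big) + o(1), \tag{C.17} \]

\[ P\Big(\max_{1\le k\le h}(T_k)^2 > y_q\Big) \ge \sum_{\ell=1}^{2M}(-1)^{\ell-1}\sum_{1\le k_1<\cdots<k_\ell\le h} P\Big(\|N_\ell\|_{\min} > y_q^{1/2} + \varepsilon_n(\log q)^{-1/2}\Big) - o(1). \tag{C.18} \]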


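The detailed proof of Lemma C.4 is in the supplementary Appendix D.4. Finally, to complete the proof, we need to prove that the right-hand sides of (C.17) and (C.18) have the same limit $1 - \exp\big(-(\sqrt{8\pi})^{-1}\exp(-x/2)\big)$ as $n, q \to \infty$. To calculate the limit, we need the following lemma.


Lemma C.5. Under Assumption (A3), for any integer $\ell \ge 1$ and $x \in \mathbb{R}$, we have

\[ \sum_{1\le k_1<\cdots<k_\ell\le h} P\Big(\|N_\ell\|_{\min} > y_q^{1/2} \pm \varepsilon_n(\log q)^{-1/2}\Big) = \frac{1}{\ell!}\Big(\frac{1}{\sqrt{8\pi}}\exp\Big(-\frac{x}{2}\Big)\Big)^{\ell}(1 + o(1)). \tag{C.19} \]

The detailed proof of Lemma C.5 is in the supplementary Appendix D.5. By plugging (C.19) into (C.17) and (C.18), we obtain the following inequalities:

\[ \limsup_{n,q\to\infty} P\Big(\max_{1\le k\le h}(T_k)^2 > y_q\Big) \le \sum_{\ell=1}^{2M-1}(-1)^{\ell-1}\frac{1}{\ell!}\Big(\frac{1}{\sqrt{8\pi}}\exp\Big(-\frac{x}{2}\Big)\Big)^{\ell}, \]

\[ \liminf_{n,q\to\infty} P\Big(\max_{1\le k\le h}(T_k)^2 > y_q\Big) \ge \sum_{\ell=1}^{2M}(-1)^{\ell-1}\frac{1}{\ell!}\Big(\frac{1}{\sqrt{8\pi}}\exp\Big(-\frac{x}{2}\Big)\Big)^{\ell}, \]

for any positive integer $M$. Letting $M \to \infty$, we prove (C.11). Therefore, we finish the proof of Theorem 2.2. □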
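The last step of the argument can be checked numerically. The sketch below is an illustration only (the function names are hypothetical, not from the paper): writing $a = (8\pi)^{-1/2}\exp(-x/2)$, it evaluates the truncated alternating sums $\sum_{\ell=1}^{L}(-1)^{\ell-1}a^{\ell}/\ell!$ and confirms that the sums with $L = 2M$ and $L = 2M-1$ bracket $1 - \exp(-a)$ and close in on it as $M$ grows.

```python
import math


def limit_tail(x):
    """Limiting value 1 - exp(-(8*pi)^(-1/2) * exp(-x/2)) of P(max T_k^2 > y_q)."""
    a = math.exp(-x / 2.0) / math.sqrt(8.0 * math.pi)
    return 1.0 - math.exp(-a)


def truncated_sum(x, terms):
    """Alternating sum over l = 1..terms of (-1)^(l-1) * a^l / l!."""
    a = math.exp(-x / 2.0) / math.sqrt(8.0 * math.pi)
    return sum((-1) ** (l - 1) * a ** l / math.factorial(l)
               for l in range(1, terms + 1))


def bracket(x, M):
    """Return (lower, upper): sums truncated at 2M and at 2M - 1 terms."""
    return truncated_sum(x, 2 * M), truncated_sum(x, 2 * M - 1)


if __name__ == "__main__":
    for M in (1, 2, 4):
        lo, hi = bracket(0.0, M)
        print(M, lo, limit_tail(0.0), hi)
```

The bracketing holds for every $M$ because the partial sums of the Taylor series of $e^{-a}$ with $a > 0$ alternate around their limit.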


C.2 Proof of Theorem 2.4 

Proof. In Theorem 2.4, we aim to prove that, under Assumption (A2), as $n, q \to \infty$, we have

\[ \inf_{(U_1,U_2)\in A(4)} P\big(M_n > G^-(\alpha) + 4\log q - \log(\log q)\big) \to 1, \tag{C.20} \]

where $G^-(\alpha) := -\log(8\pi) - 2\log\big(-\log(1-\alpha)\big)$. In (C.20), we set $U_a$ and $A(C)$ as

\[ U_a = [u_{a,ij}] \in \mathbb{R}^{q\times q} \quad\text{and}\quad A(C) = \Big\{(U_1,U_2) : \max_{(i,j)\in S}\frac{|u_{1,ij} - u_{2,ij}|}{\sqrt{m^2\zeta_{1,ij}/n_1 + m^2\zeta_{2,ij}/n_2}} \ge C\sqrt{\log q}\Big\}, \]

where $S = \{(i,j) : 1 \le i,j \le q\}$. We use $A(C)$ to characterize the set of alternative hypotheses for analyzing the power of $T_\alpha$ in (2.6).

To see the meaning of $A(C)$, note that for any constant $C$, if one entry of $U_1 - U_2$ has a large enough magnitude, we have $(U_1,U_2) \in A(C)$. Therefore, a small perturbation of a few entries relative to the null hypothesis $U_1 = U_2$ makes the pair $(U_1,U_2)$ belong to $A(C)$ for some constant $C$. In Theorem 2.4, we require that $(U_1,U_2) \in A(4)$, which implies

\[ \max_{(i,j)\in S}\frac{(u_{1,ij} - u_{2,ij})^2}{m^2\zeta_{1,ij}/n_1 + m^2\zeta_{2,ij}/n_2} \ge 16\log q. \tag{C.21} \]


Under the alternative hypothesis, $u_{1,ij} = u_{2,ij}$ cannot hold for all $(i,j) \in S$. This motivates us to define

\[ M_n^1 := \max_{(i,j)\in S}\frac{(\hat u_{1,ij} - \hat u_{2,ij} - u_{1,ij} + u_{2,ij})^2}{\hat\sigma^2(\hat u_{1,ij}) + \hat\sigma^2(\hat u_{2,ij})} \quad\text{and}\quad M_n := \max_{(i,j)\in S} M_{ij} = \max_{(i,j)\in S}\frac{(\hat u_{1,ij} - \hat u_{2,ij})^2}{\hat\sigma^2(\hat u_{1,ij}) + \hat\sigma^2(\hat u_{2,ij})}, \tag{C.22} \]

because $M_n^1$ and $M_n$ are different under the alternative hypothesis. To prove (C.20), using the inequality $(a \pm b)^2 \le 2a^2 + 2b^2$, we get

\[ (u_{1,ij} - u_{2,ij})^2 \le 2(\hat u_{1,ij} - \hat u_{2,ij} - u_{1,ij} + u_{2,ij})^2 + 2(\hat u_{1,ij} - \hat u_{2,ij})^2. \]

Therefore, by the definitions of $M_n$ and $M_n^1$ in (C.22), we get

\[ \max_{(i,j)\in S}\frac{(u_{1,ij} - u_{2,ij})^2}{\hat\sigma^2(\hat u_{1,ij}) + \hat\sigma^2(\hat u_{2,ij})} \le 2M_n^1 + 2M_n. \tag{C.23} \]


Under $(U_1,U_2) \in A(4)$, to prove $P\big(M_n > G^-(\alpha) + 4\log q - \log(\log q)\big) \to 1$, we need the following two lemmas.

Lemma C.6. Under Assumption (A2), as $n, q \to \infty$, we have

\[ P\Big(M_n^1 \le 4\log q - \frac{1}{2}\log(\log q)\Big) \to 1. \tag{C.24} \]

The detailed proof of Lemma C.6 is in the supplementary Appendix D.6.

Lemma C.7. Under Assumption (A2), as $n, q \to \infty$, we have

\[ P\Big(\max_{(i,j)\in S}\frac{(u_{1,ij} - u_{2,ij})^2}{\hat\sigma^2(\hat u_{1,ij}) + \hat\sigma^2(\hat u_{2,ij})} \ge 16\log q\Big) \to 1, \tag{C.25} \]

uniformly for all $(U_1,U_2) \in A(4)$.

The detailed proof of Lemma C.7 is in the supplementary Appendix D.7.

Combining (C.23), (C.24) and (C.25), as $n, q \to \infty$, with probability going to one, we have

\[ M_n \ge \frac{1}{2}\max_{(i,j)\in S}\frac{(u_{1,ij} - u_{2,ij})^2}{\hat\sigma^2(\hat u_{1,ij}) + \hat\sigma^2(\hat u_{2,ij})} - M_n^1 \ge 4\log q + \frac{1}{2}\log(\log q). \]

Therefore, as $n, q \to \infty$, we have

\[ 1 \ge P\big(M_n > G^-(\alpha) + 4\log q - \log(\log q)\big) \ge P\Big(M_n \ge 4\log q + \frac{1}{2}\log(\log q)\Big) \to 1. \]

Hence, we finish the proof of Theorem 2.4. □
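The final chain of inequalities compares two thresholds, and a short numerical sketch makes the comparison concrete. The helper names below are hypothetical. The code evaluates $G^-(\alpha)$, confirms that it is the $(1-\alpha)$ quantile of the limiting law $\exp(-(8\pi)^{-1/2}e^{-x/2})$, and compares the rejection threshold $G^-(\alpha) + 4\log q - \log\log q$ with the level $4\log q + \tfrac12\log\log q$ reached by $M_n$ under the alternative. The second inequality in the chain needs $G^-(\alpha) \le \tfrac32\log\log q$, which holds once $q$ is large enough for the given $\alpha$.

```python
import math


def g_minus(alpha):
    """G^-(alpha) = -log(8*pi) - 2*log(-log(1 - alpha))."""
    return -math.log(8.0 * math.pi) - 2.0 * math.log(-math.log(1.0 - alpha))


def limit_cdf(x):
    """Limiting cdf exp(-(8*pi)^(-1/2) * exp(-x/2))."""
    return math.exp(-math.exp(-x / 2.0) / math.sqrt(8.0 * math.pi))


def rejection_threshold(alpha, q):
    """Threshold that M_n must exceed for the test to reject."""
    return g_minus(alpha) + 4.0 * math.log(q) - math.log(math.log(q))


def alternative_level(q):
    """Level 4*log(q) + 0.5*log(log(q)) used in the power argument."""
    return 4.0 * math.log(q) + 0.5 * math.log(math.log(q))


if __name__ == "__main__":
    for q in (100, 1000, 10 ** 6):
        print(q, rejection_threshold(0.05, q), alternative_level(q))
```

For $\alpha = 0.05$ the comparison flips between $q = 100$ and $q = 1000$, which illustrates that the argument is asymptotic in $q$.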
C.3 Proof of Theorem 3.2


Proof. In Theorem 3.2, we aim to prove (3.10) and (3.11). Equation (3.10) is an application of Theorem 2.2 to Kendall's tau matrices. Therefore, we only need to prove (3.11). Equation (3.11) is the same as (3.10) except that in (3.11) we use $\hat\sigma^2_{\mathrm{plug}}(\hat\tau_{a,ij})$ (defined in (3.5)) to replace the Jackknife variance estimator $\hat\sigma^2(\hat\tau_{a,ij})$ in (3.10).

We prove (3.11) in the same three steps as those for Theorem 2.2, except that we need to verify that $\hat\sigma^2_{\mathrm{plug}}(\hat\tau_{a,ij})$ satisfies

\[ P\Big(\max_{1\le i<j\le d}\big|n_a\hat\sigma^2_{\mathrm{plug}}(\hat\tau_{a,ij}) - 4\zeta_{a,ij}\big| > C\frac{\varepsilon_n}{\log d}\Big) = o(1), \tag{C.26} \]

where $\varepsilon_n = o(1)$. (C.26) is the same as (C.1) except that in (C.26) we use $\hat\sigma^2_{\mathrm{plug}}(\hat\tau_{a,ij})$ to replace $\hat\sigma^2(\hat\tau_{a,ij})$ in (C.1). Once (C.26) is obtained, by the same argument as in the first step of the proof of Theorem 2.2, we obtain $|M_n^{\tau,\mathrm{plug}} - \max_{(i,j)\in S} M_{ij}| = o_p(1)$, where $M_n^{\tau,\mathrm{plug}}$ and $M_{ij}$ are defined in (3.6) and (2.13). Following the last two steps of the proof of Theorem 2.2, we obtain (3.11).

The proof of (C.26) proceeds in two steps. We define $\Pi_{cc,ij}$ and $\Pi_{c,ij}$ in (3.4) and (3.2). In the first step, we rewrite (C.26) as

\[ P\Big(\max_{1\le i<j\le d}\big|\big(\hat\Pi_{cc,ij} - (\hat\Pi_{c,ij})^2\big) - \big(\Pi_{cc,ij} - (\Pi_{c,ij})^2\big)\big| > C\frac{\varepsilon_n}{\log d}\Big) = o(1), \tag{C.27} \]

where $\hat\Pi_{cc,ij}$ and $\hat\Pi_{c,ij}$ are U-statistics that estimate $\Pi_{cc,ij}$ and $\Pi_{c,ij}$. In the second step, we obtain upper bounds on $\max_{1\le i<j\le d}|\hat\Pi_{cc,ij} - \Pi_{cc,ij}|$ and $\max_{1\le i<j\le d}|(\hat\Pi_{c,ij})^2 - (\Pi_{c,ij})^2|$. Using these upper bounds, we prove (C.27).

Step (i). For notational simplicity, we ignore the subscript $a$ ($a = 1$ or $2$) for $X$ and $Y$. To rewrite (C.26) as (C.27), we need $4\zeta_{ij} = 16\big(\Pi_{cc,ij} - (\Pi_{c,ij})^2\big)$ and $\hat\sigma^2_{\mathrm{plug}}(\hat\tau_{ij}) = 16\big(\hat\Pi_{cc,ij} - \hat\Pi_{c,ij}^2\big)/n$. The second identity holds by the definition of $\hat\sigma^2_{\mathrm{plug}}(\hat\tau_{ij})$ in (3.5). We then need to prove $4\zeta_{ij} = 16\big(\Pi_{cc,ij} - (\Pi_{c,ij})^2\big)$. By (3.3), the variance of $\hat\tau_{ij}$ is

\[ \frac{8}{n(n-1)}\Pi_{c,ij}(1 - \Pi_{c,ij}) + 16\,\frac{n-2}{n(n-1)}\big(\Pi_{cc,ij} - \Pi_{c,ij}^2\big). \]

Therefore, $16\big(\Pi_{cc,ij} - (\Pi_{c,ij})^2\big)$ is the asymptotic variance of $\sqrt{n}\,\hat\tau_{ij}$ as $n \to \infty$. Lemma C.1 implies that $4\zeta_{ij}$ is also the asymptotic variance of $\sqrt{n}\,\hat\tau_{ij}$ as $n \to \infty$. Hence, we have

\[ 16\big(\Pi_{cc,ij} - (\Pi_{c,ij})^2\big) = 4\zeta_{ij}, \tag{C.28} \]

because both sides of (C.28) are the asymptotic variance of $\sqrt{n}\,\hat\tau_{ij}$. Therefore, combining $\hat\sigma^2_{\mathrm{plug}}(\hat\tau_{ij}) = 16\big(\hat\Pi_{cc,ij} - \hat\Pi_{c,ij}^2\big)/n$ and (C.28), to prove (C.26), it suffices to prove (C.27) as $n, d \to \infty$.

Step (ii). We aim to prove (C.27), in which $\hat\Pi_{cc,ij}$ is a U-statistic estimator of $\Pi_{cc,ij}$ (defined in (3.4)). Therefore, we use the exponential inequality (Lemma F.2) to obtain

\[ P\Big(\max_{1\le i<j\le d}\big|\hat\Pi_{cc,ij} - \Pi_{cc,ij}\big| > t\Big) \le C_1 d^2\exp(-C_2 n t^2). \tag{C.29} \]

$\hat\Pi_{c,ij}$ is also a U-statistic, which estimates $\Pi_{c,ij}$ (defined in (3.2)). Similarly to (C.29), we use the exponential inequality (Lemma F.2) to obtain

\[ P\Big(\max_{1\le i<j\le d}\big|\hat\Pi_{c,ij} - \Pi_{c,ij}\big| > t\Big) \le C_1 d^2\exp(-C_2 n t^2). \tag{C.30} \]
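By the definitions of $\hat\Pi_{c,ij}$ and $\Pi_{c,ij}$, we have $|\hat\Pi_{c,ij}| \le 2$ and $|\Pi_{c,ij}| \le 2$. Therefore, we have $|\hat\Pi_{c,ij} + \Pi_{c,ij}| \le 4$, which, combined with (C.30), implies that

\[ P\Big(\max_{1\le i<j\le d}\big|(\hat\Pi_{c,ij})^2 - (\Pi_{c,ij})^2\big| > t\Big) \le P\Big(\max_{1\le i<j\le d} 4\big|\hat\Pi_{c,ij} - \Pi_{c,ij}\big| > t\Big) \le C_1 d^2\exp(-C_2 n t^2). \tag{C.31} \]

Considering $\log d = O(n^{1/3-\epsilon})$ (see Assumption (A2)), (C.29) and (C.31), by setting $\varepsilon_n = 1/(\log d)^{\kappa_0}$ with $\kappa_0 > 0$ sufficiently small, we have

\[ P\Big(\max_{1\le i<j\le d}\big|\hat\Pi_{cc,ij} - \Pi_{cc,ij}\big| > C\frac{\varepsilon_n}{\log d}\Big) = o(1), \tag{C.32} \]

\[ P\Big(\max_{1\le i<j\le d}\big|(\hat\Pi_{c,ij})^2 - (\Pi_{c,ij})^2\big| > C\frac{\varepsilon_n}{\log d}\Big) = o(1). \tag{C.33} \]

By the triangle inequality, we have

\[ P\Big(\max_{1\le i<j\le d}\big|\big(\hat\Pi_{cc,ij} - (\hat\Pi_{c,ij})^2\big) - \big(\Pi_{cc,ij} - (\Pi_{c,ij})^2\big)\big| > C\frac{\varepsilon_n}{\log d}\Big) \le P\Big(\max_{1\le i<j\le d}\big|\hat\Pi_{cc,ij} - \Pi_{cc,ij}\big| > 0.5\,C\frac{\varepsilon_n}{\log d}\Big) + P\Big(\max_{1\le i<j\le d}\big|(\hat\Pi_{c,ij})^2 - (\Pi_{c,ij})^2\big| > 0.5\,C\frac{\varepsilon_n}{\log d}\Big). \tag{C.34} \]

Combining (C.32), (C.33) and (C.34), we prove (C.27). This completes the proof. □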
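To see why $\kappa_0$ must be small, one can evaluate the logarithm of the union bound $C_1 d^2\exp(-C_2 n t^2)$ at $t = \varepsilon_n/\log d$. The sketch below is illustrative only: the constants $C_1 = C_2 = 1$, the exponent $\epsilon = 0.1$ and the choice $\log d = n^{1/3-\epsilon}$ are assumptions made for the example, not values from the paper.

```python
import math


def log_union_bound(n, eps=0.1, kappa0=0.1, c1=1.0, c2=1.0):
    """log of C1 * d^2 * exp(-C2 * n * t^2) with log d = n^(1/3 - eps),
    eps_n = (log d)^(-kappa0) and t = eps_n / log d (all assumed for illustration)."""
    log_d = n ** (1.0 / 3.0 - eps)
    eps_n = log_d ** (-kappa0)
    t = eps_n / log_d
    return math.log(c1) + 2.0 * log_d - c2 * n * t * t


if __name__ == "__main__":
    for n in (10 ** 3, 10 ** 4, 10 ** 6):
        print(n, log_union_bound(n), log_union_bound(n, kappa0=1.0))
```

With a small $\kappa_0$ the log-bound diverges to $-\infty$, so the probabilities in (C.32) and (C.33) vanish. With $\kappa_0 = 1$ the exponent $n t^2$ grows more slowly than $2\log d$ in this setting and the bound becomes uninformative.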


C.4 Proof of Theorem 3.3 

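Proof. By setting

\[ M_{ij}^{\tau,\mathrm{ps}} := \frac{(\hat\tau_{1,ij} - \hat\tau_{2,ij})^2}{\sigma^2_{1,\mathrm{ps}}/n_1 + \sigma^2_{2,\mathrm{ps}}/n_2} \quad\text{and}\quad M_n^{\tau,\mathrm{ps}} := \max_{(i,j)\in S} M_{ij}^{\tau,\mathrm{ps}} = \max_{(i,j)\in S}\frac{(\hat\tau_{1,ij} - \hat\tau_{2,ij})^2}{\sigma^2_{1,\mathrm{ps}}/n_1 + \sigma^2_{2,\mathrm{ps}}/n_2}, \tag{C.35} \]

where $S = \{(i,j) : 1 \le i < j \le d\}$, we aim to prove that as $n, d \to \infty$, we have

\[ P\big(M_n^{\tau,\mathrm{ps}} - 4\log d + \log(\log d) \le x\big) \to \exp\Big(-\frac{1}{\sqrt{8\pi}}\exp\Big(-\frac{x}{2}\Big)\Big). \tag{C.36} \]

(C.36) is the same as (3.10) except that in $M_n^{\tau,\mathrm{ps}}$ we use $\sigma^2_{a,\mathrm{ps}}/n_a$ to replace $\hat\sigma^2(\hat\tau_{a,ij})$ in (3.10).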

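We first present a sketch of the proof. For notational simplicity, we introduce

\[ g^X_{ij}(X_\alpha) := E\big[\mathrm{sign}(X_{\alpha i} - X_{\beta i})\,\mathrm{sign}(X_{\alpha j} - X_{\beta j}) \mid X_\alpha\big], \qquad g^Y_{ij}(Y_\alpha) := E\big[\mathrm{sign}(Y_{\alpha i} - Y_{\beta i})\,\mathrm{sign}(Y_{\alpha j} - Y_{\beta j}) \mid Y_\alpha\big], \tag{C.37} \]

with $\beta \ne \alpha$. The proof proceeds in three steps. In the first step, we use the Hoeffding method (Hoeffding, 1948) to decompose Kendall's tau as

\[ \hat\tau_{1,ij} = \tau_{1,ij} + \frac{2}{n_1}\sum_{\alpha=1}^{n_1} h^X_{ij}(X_\alpha) + \frac{2}{n_1(n_1-1)}\Delta_{n_1,ij} \quad\text{and}\quad \hat\tau_{2,ij} = \tau_{2,ij} + \frac{2}{n_2}\sum_{\alpha=1}^{n_2} h^Y_{ij}(Y_\alpha) + \frac{2}{n_2(n_2-1)}\Delta_{n_2,ij}, \tag{C.38} \]

where $h^X_{ij}(X_\alpha) := g^X_{ij}(X_\alpha) - \tau_{1,ij}$, $h^Y_{ij}(Y_\alpha) := g^Y_{ij}(Y_\alpha) - \tau_{2,ij}$ and

\[ \Delta_{n_1,ij} = \sum_{1\le k<\ell\le n_1}\big(\mathrm{sign}(X_{ki} - X_{\ell i})\,\mathrm{sign}(X_{kj} - X_{\ell j}) - \tau_{1,ij} - h^X_{ij}(X_k) - h^X_{ij}(X_\ell)\big), \]

\[ \Delta_{n_2,ij} = \sum_{1\le k<\ell\le n_2}\big(\mathrm{sign}(Y_{ki} - Y_{\ell i})\,\mathrm{sign}(Y_{kj} - Y_{\ell j}) - \tau_{2,ij} - h^Y_{ij}(Y_k) - h^Y_{ij}(Y_\ell)\big). \]

Here $2\sum_{\alpha=1}^{n_1} h^X_{ij}(X_\alpha)/n_1$ and $2\sum_{\alpha=1}^{n_2} h^Y_{ij}(Y_\alpha)/n_2$ are sums of i.i.d. random variables and $2\Delta_{n_a,ij}/\big(n_a(n_a-1)\big)$ is a small residual term. Similar to (C.5), we define $\tilde T_{ij}$ as

\[ \tilde T_{ij} := \Big(\sum_{\alpha=1}^{n_1} h^X_{ij}(X_\alpha)/n_1 - \sum_{\alpha=1}^{n_2} h^Y_{ij}(Y_\alpha)/n_2\Big)\Big/\sqrt{\sigma^2_{1,\mathrm{ps}}/(4n_1) + \sigma^2_{2,\mathrm{ps}}/(4n_2)}. \]

We then prove that the residual term $2\Delta_{n_a,ij}/\big(n_a(n_a-1)\big)$ is negligible, i.e., to obtain Theorem 3.3, it suffices to prove that as $n, d \to \infty$, we have

\[ P\Big(\max_{(i,j)\in S}(\tilde T_{ij})^2 - 4\log d + \log(\log d) \le x\Big) \to \exp\Big(-\frac{1}{\sqrt{8\pi}}\exp\Big(-\frac{x}{2}\Big)\Big). \tag{C.39} \]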

In the second step, we prove that it is sufficient to take the maximum of $(\tilde T_{ij})^2$ over $S\setminus S_0$ instead of over $S$ as in (C.39), i.e., to obtain Theorem 3.3, it suffices to prove that as $n, d \to \infty$, we have

\[ P\Big(\max_{(i,j)\in S\setminus S_0}(\tilde T_{ij})^2 - 4\log d + \log(\log d) \le x\Big) \to \exp\Big(-\frac{1}{\sqrt{8\pi}}\exp\Big(-\frac{x}{2}\Big)\Big). \tag{C.40} \]

In the last step, we prove that it is sufficient to replace $\tilde T_{ij}$ in (C.40) with $T_{ij}$, i.e., to obtain Theorem 3.3, it suffices to prove that as $n, d \to \infty$, we have

\[ P\Big(\max_{(i,j)\in S\setminus S_0}(T_{ij})^2 - 4\log d + \log(\log d) \le x\Big) \to \exp\Big(-\frac{1}{\sqrt{8\pi}}\exp\Big(-\frac{x}{2}\Big)\Big), \tag{C.41} \]

where $T_{ij}$ and $S_0$ are defined in (C.5) and (C.8). Because (C.41) and (C.10) are the same, we prove (C.41) by following the proof of (C.10). Hence, we complete the proof of Theorem 3.3. We then present the proof in detail.
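The decomposition (C.38) is an algebraic identity once $g_{ij}$ and $\tau_{ij}$ are fixed, and this can be checked on simulated data. The sketch below is a hedged illustration with hypothetical function names. It assumes two independent Uniform(0,1) coordinates, for which $\tau = 0$ and $g(x) = (2x_i - 1)(2x_j - 1)$, and verifies that $\hat\tau = \tau + \frac{2}{n}\sum_\alpha h(X_\alpha) + \frac{2}{n(n-1)}\Delta_n$ up to rounding.

```python
import random


def sign(v):
    """Sign of v as -1, 0 or 1."""
    return (v > 0) - (v < 0)


def hoeffding_pieces(xs, ys, tau=0.0):
    """Return (tau_hat, linear, residual) for Kendall's tau of the pairs (xs, ys).

    Assumes independent Uniform(0,1) coordinates, so that
    g(x, y) = (2x - 1)(2y - 1) and h = g - tau with tau = 0.
    """
    n = len(xs)
    h = [(2 * x - 1) * (2 * y - 1) - tau for x, y in zip(xs, ys)]
    pair_sum = 0.0
    delta = 0.0
    for k in range(n):
        for l in range(k + 1, n):
            s = sign(xs[k] - xs[l]) * sign(ys[k] - ys[l])
            pair_sum += s
            delta += s - tau - h[k] - h[l]
    tau_hat = 2.0 * pair_sum / (n * (n - 1))
    linear = 2.0 * sum(h) / n                # sum of i.i.d. terms
    residual = 2.0 * delta / (n * (n - 1))   # small remainder term
    return tau_hat, linear, residual


if __name__ == "__main__":
    rng = random.Random(0)
    xs = [rng.random() for _ in range(200)]
    ys = [rng.random() for _ in range(200)]
    tau_hat, linear, residual = hoeffding_pieces(xs, ys)
    print(tau_hat, linear, residual, tau_hat - linear - residual)
```

In this setting the residual is of order $1/n$ while the linear term is of order $n^{-1/2}$, which is the sense in which Step (i) treats the residual as negligible.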

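Step (i). For notational simplicity, we introduce

\[ N_{ij}^{\tau,\mathrm{ps}} := \frac{\hat\tau_{1,ij} - \hat\tau_{2,ij}}{\sqrt{\sigma^2_{1,\mathrm{ps}}/n_1 + \sigma^2_{2,\mathrm{ps}}/n_2}} = \tilde T_{ij} + W_{ij} \quad\text{with}\quad W_{ij} := \frac{\Delta_{n_1,ij}/\big(n_1(n_1-1)\big) - \Delta_{n_2,ij}/\big(n_2(n_2-1)\big)}{\sqrt{\sigma^2_{1,\mathrm{ps}}/(4n_1) + \sigma^2_{2,\mathrm{ps}}/(4n_2)}}. \tag{C.42} \]

By (C.35) and (C.42), we have $M_{ij}^{\tau,\mathrm{ps}} = (N_{ij}^{\tau,\mathrm{ps}})^2$. We then introduce the following lemma to finish the proof of this step.


Lemma C.8. As $n, d \to \infty$, under Assumption (A2) we have

\[ \Big|\max_{1\le i<j\le d}(N_{ij}^{\tau,\mathrm{ps}})^2 - \max_{1\le i<j\le d}(\tilde T_{ij})^2\Big| = o_p(1). \tag{C.43} \]

The detailed proof of Lemma C.8 is in the supplementary Appendix D.8. By Lemma C.8, we obtain that $\max_{1\le i<j\le d}(N_{ij}^{\tau,\mathrm{ps}})^2$ and $\max_{1\le i<j\le d}(\tilde T_{ij})^2$ have the same limiting distribution as $n, d \to \infty$. Therefore, to obtain Theorem 3.3, it suffices to prove (C.39) as $n, d \to \infty$.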

Step (ii). In the second step, we aim to prove that (C.40) is enough to prove the theorem. By setting $y_d = x + 4\log d - \log(\log d)$, we have

\[ \Big|P\Big(\max_{(i,j)\in S}(\tilde T_{ij})^2 > y_d\Big) - P\Big(\max_{(i,j)\in S\setminus S_0}(\tilde T_{ij})^2 > y_d\Big)\Big| \le P\Big(\max_{(i,j)\in S_0}(\tilde T_{ij})^2 > y_d\Big). \tag{C.44} \]

We then introduce an additional lemma.


Lemma C.9. Under Assumptions (A1) and (A2), as $n, d \to \infty$, we have

\[ P\Big(\max_{(i,j)\in S_0}(\tilde T_{ij})^2 > y_d\Big) = o(1). \]

The detailed proof of Lemma C.9 is in the supplementary Appendix D.9. By Lemma C.9, we obtain that as $n, d \to \infty$, $P\big(\max_{(i,j)\in S}(\tilde T_{ij})^2 > y_d\big)$ and $P\big(\max_{(i,j)\in S\setminus S_0}(\tilde T_{ij})^2 > y_d\big)$ have the same limit. Hence, we complete the proof of the second step.

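Step (iii). In this step, we aim to prove that it is sufficient to have (C.41) as $n, d \to \infty$. For this, we need the following lemma.

Lemma C.10. If $X$ and $Y$ follow the meta-elliptical distribution, we have

\[ |4\zeta_{a,ij} - \sigma^2_{a,\mathrm{ps}}| \le C|\tau_{a,ij}|, \]

where $C$ is a constant that does not depend on $i$ or $j$. The definitions of $\zeta_{a,ij}$ and $\sigma_{a,\mathrm{ps}}$ are in (2.10) and (3.7).


The detailed proof of Lemma C.10 is in the supplementary Appendix D.10. By the definition of $S_0$ in (C.8), for any $(i,j) \in S\setminus S_0$, we have $|\tau_{a,ij}| \le (\log d)^{-1-\alpha_0}$. By Lemma C.10, we have $|4\zeta_{a,ij} - \sigma^2_{a,\mathrm{ps}}| \le C|\tau_{a,ij}|$. Hence, for $(i,j) \in S\setminus S_0$, we have $|4\zeta_{a,ij} - \sigma^2_{a,\mathrm{ps}}| \le C|\tau_{a,ij}| \le C(\log d)^{-1-\alpha_0}$. Considering $\zeta_{a,ij} > r_a > 0$ (see Assumption (A2)), for $(i,j) \in S\setminus S_0$, we obtain

\[ \big|\sigma^2_{a,\mathrm{ps}}/(4\zeta_{a,ij}) - 1\big| \le C(\log d)^{-1-\alpha_0} \quad\text{and}\quad \big|4\zeta_{a,ij}/\sigma^2_{a,\mathrm{ps}} - 1\big| \le C(\log d)^{-1-\alpha_0}. \tag{C.45} \]

To prove the sufficiency of (C.41), we need to prove $\big|\max_{(i,j)\in S\setminus S_0}(\tilde T_{ij})^2 - \max_{(i,j)\in S\setminus S_0}(T_{ij})^2\big| = o_p(1)$. For this, we calculate the relative difference of $(\tilde T_{ij})^2$ and $(T_{ij})^2$ as

\[ \frac{\big|(\tilde T_{ij})^2 - (T_{ij})^2\big|}{(T_{ij})^2} \le \frac{|\sigma^2_{1,\mathrm{ps}} - 4\zeta_{1,ij}|}{\sigma^2_{1,\mathrm{ps}}} + \frac{|\sigma^2_{2,\mathrm{ps}} - 4\zeta_{2,ij}|}{\sigma^2_{2,\mathrm{ps}}}. \]

By (C.45), we then have $|(\tilde T_{ij})^2 - (T_{ij})^2| \le C(T_{ij})^2(\log d)^{-1-\alpha_0}$ for $(i,j) \in S\setminus S_0$. Hence, we have

\[ \Big|\max_{(i,j)\in S\setminus S_0}(\tilde T_{ij})^2 - \max_{(i,j)\in S\setminus S_0}(T_{ij})^2\Big| \le \max_{(i,j)\in S\setminus S_0}\big|(\tilde T_{ij})^2 - (T_{ij})^2\big| \le C(\log d)^{-1-\alpha_0}\max_{(i,j)\in S\setminus S_0}(T_{ij})^2. \]

Considering $\max_{(i,j)\in S\setminus S_0}(T_{ij})^2 = O_p(\log d)$, we have $\big|\max_{(i,j)\in S\setminus S_0}(\tilde T_{ij})^2 - \max_{(i,j)\in S\setminus S_0}(T_{ij})^2\big| = o_p(1)$. Therefore, it suffices to prove (C.41) as $n, d \to \infty$. Because (C.41) and (C.10) are the same, we prove (C.41) by following the proof of (C.10). Hence, we prove Theorem 3.3. □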
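The relative-difference bound in Step (iii) only involves the two denominators, since $\tilde T_{ij}$ and $T_{ij}$ share the same numerator. The sketch below (hypothetical helper names, randomly drawn illustrative values) checks that $|\tilde T^2/T^2 - 1|$ never exceeds $|\sigma_1^2 - 4\zeta_1|/\sigma_1^2 + |\sigma_2^2 - 4\zeta_2|/\sigma_2^2$, where `sig1_sq` and `sig2_sq` stand for $\sigma^2_{1,\mathrm{ps}}$ and $\sigma^2_{2,\mathrm{ps}}$.

```python
import random


def relative_gap(zeta1, zeta2, sig1_sq, sig2_sq, n1, n2):
    """|T~^2 / T^2 - 1|; the shared numerator cancels, leaving a ratio of denominators."""
    denom_t = zeta1 / n1 + zeta2 / n2
    denom_t_tilde = sig1_sq / (4.0 * n1) + sig2_sq / (4.0 * n2)
    return abs(denom_t / denom_t_tilde - 1.0)


def claimed_bound(zeta1, zeta2, sig1_sq, sig2_sq):
    """Right-hand side of the relative-difference inequality."""
    return (abs(sig1_sq - 4.0 * zeta1) / sig1_sq
            + abs(sig2_sq - 4.0 * zeta2) / sig2_sq)


def worst_slack(trials=10000, seed=0):
    """Smallest value of (bound - gap) over random parameter draws."""
    rng = random.Random(seed)
    worst = float("inf")
    for _ in range(trials):
        z1, z2 = rng.uniform(0.05, 1.0), rng.uniform(0.05, 1.0)
        s1, s2 = rng.uniform(0.2, 4.0), rng.uniform(0.2, 4.0)
        n1, n2 = rng.randint(10, 1000), rng.randint(10, 1000)
        slack = claimed_bound(z1, z2, s1, s2) - relative_gap(z1, z2, s1, s2, n1, n2)
        worst = min(worst, slack)
    return worst


if __name__ == "__main__":
    print(worst_slack())
```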


C.5 Proof of Theorem 3.4 


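Proof. To obtain Theorem 3.4, we aim to prove (3.13), (3.14) and (3.15). The proof of (3.13) is the same as that of Theorem 2.4, as Kendall's tau is a special kind of U-statistic with a bounded kernel. Hence, we only need to prove (3.14) and (3.15).

First, we prove (3.14). Under the alternative hypothesis, $\tau_{1,ij} = \tau_{2,ij}$ cannot hold for all $(i,j) \in S$, where $S = \{(i,j) : 1 \le i < j \le d\}$. This motivates us to define

\[ M_n^{1,\mathrm{plug}} := \max_{(i,j)\in S}\frac{(\hat\tau_{1,ij} - \hat\tau_{2,ij} - \tau_{1,ij} + \tau_{2,ij})^2}{\hat\sigma^2_{\mathrm{plug}}(\hat\tau_{1,ij}) + \hat\sigma^2_{\mathrm{plug}}(\hat\tau_{2,ij})}, \qquad M_n^{\tau,\mathrm{plug}} := \max_{(i,j)\in S}\frac{(\hat\tau_{1,ij} - \hat\tau_{2,ij})^2}{\hat\sigma^2_{\mathrm{plug}}(\hat\tau_{1,ij}) + \hat\sigma^2_{\mathrm{plug}}(\hat\tau_{2,ij})}, \tag{C.46} \]

because $M_n^{1,\mathrm{plug}}$ and $M_n^{\tau,\mathrm{plug}}$ are different under the alternative hypothesis. Similarly to Theorem 2.4, to prove (3.14), using the inequality $(a \pm b)^2 \le 2a^2 + 2b^2$, we obtain

\[ (\tau_{1,ij} - \tau_{2,ij})^2 \le 2(\hat\tau_{1,ij} - \hat\tau_{2,ij} - \tau_{1,ij} + \tau_{2,ij})^2 + 2(\hat\tau_{1,ij} - \hat\tau_{2,ij})^2. \]

By the definitions of $M_n^{1,\mathrm{plug}}$ and $M_n^{\tau,\mathrm{plug}}$ in (C.46), we get

\[ \max_{(i,j)\in S}\frac{(\tau_{1,ij} - \tau_{2,ij})^2}{\hat\sigma^2_{\mathrm{plug}}(\hat\tau_{1,ij}) + \hat\sigma^2_{\mathrm{plug}}(\hat\tau_{2,ij})} \le 2M_n^{1,\mathrm{plug}} + 2M_n^{\tau,\mathrm{plug}}. \tag{C.47} \]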


To complete the proof of (3.14), we need two additional lemmas.

Lemma C.11. Under Assumption (A2), as $n, d \to \infty$, we have

\[ P\Big(M_n^{1,\mathrm{plug}} \le 4\log d - \frac{1}{2}\log(\log d)\Big) \to 1. \tag{C.48} \]

The detailed proof of Lemma C.11 is in the supplementary Appendix D.11.

Lemma C.12. Under Assumption (A2), as $n, d \to \infty$, we have

\[ P\Big(\max_{(i,j)\in S}\frac{(\tau_{1,ij} - \tau_{2,ij})^2}{\hat\sigma^2_{\mathrm{plug}}(\hat\tau_{1,ij}) + \hat\sigma^2_{\mathrm{plug}}(\hat\tau_{2,ij})} \ge 16\log d\Big) \to 1, \tag{C.49} \]

uniformly for $(U_1^{\tau}, U_2^{\tau}) \in \mathbb{U}(4)$.


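The detailed proof of Lemma C.12 is in the supplementary Appendix D.12. Combining (C.47), Lemmas C.11 and C.12, under $(U_1^{\tau}, U_2^{\tau}) \in \mathbb{U}(4)$, as $n, d \to \infty$, with probability going to one, we have

\[ M_n^{\tau,\mathrm{plug}} \ge \frac{1}{2}\max_{(i,j)\in S}\frac{(\tau_{1,ij} - \tau_{2,ij})^2}{\hat\sigma^2_{\mathrm{plug}}(\hat\tau_{1,ij}) + \hat\sigma^2_{\mathrm{plug}}(\hat\tau_{2,ij})} - M_n^{1,\mathrm{plug}} \ge 4\log d + \frac{1}{2}\log(\log d). \]

Therefore, under $(U_1^{\tau}, U_2^{\tau}) \in \mathbb{U}(4)$, as $n, d \to \infty$, we have

\[ 1 \ge P\big(M_n^{\tau,\mathrm{plug}} > G^-(\alpha) + 4\log d - \log(\log d)\big) \ge P\Big(M_n^{\tau,\mathrm{plug}} \ge 4\log d + \frac{1}{2}\log(\log d)\Big) \to 1, \]

where $G^-(\alpha) := -\log(8\pi) - 2\log\big(-\log(1-\alpha)\big)$. Hence, we prove (3.14).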

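Secondly, we aim to prove (3.15). Under the alternative hypothesis, $\tau_{1,ij} = \tau_{2,ij}$ cannot hold for all $(i,j) \in S$. This motivates us to define

\[ M_n^{1,\mathrm{ps}} := \max_{(i,j)\in S}\frac{(\hat\tau_{1,ij} - \hat\tau_{2,ij} - \tau_{1,ij} + \tau_{2,ij})^2}{\sigma^2_{1,\mathrm{ps}}/n_1 + \sigma^2_{2,\mathrm{ps}}/n_2} \quad\text{and}\quad M_n^{\tau,\mathrm{ps}} := \max_{(i,j)\in S}\frac{(\hat\tau_{1,ij} - \hat\tau_{2,ij})^2}{\sigma^2_{1,\mathrm{ps}}/n_1 + \sigma^2_{2,\mathrm{ps}}/n_2}, \tag{C.50} \]

as $M_n^{1,\mathrm{ps}}$ and $M_n^{\tau,\mathrm{ps}}$ are different under the alternative hypothesis.

We then introduce an additional lemma.

Lemma C.13. Under Assumptions (A1) and (A2), as $n, d \to \infty$, we have

\[ P\Big(M_n^{1,\mathrm{ps}} \le 4\log d - \frac{1}{2}\log(\log d)\Big) \to 1. \tag{C.51} \]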









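The detailed proof of Lemma C.13 is in the supplementary Appendix D.13. To prove (3.15), using $(a \pm b)^2 \le 2a^2 + 2b^2$, we obtain

\[ (\tau_{1,ij} - \tau_{2,ij})^2 \le 2(\hat\tau_{1,ij} - \hat\tau_{2,ij} - \tau_{1,ij} + \tau_{2,ij})^2 + 2(\hat\tau_{1,ij} - \hat\tau_{2,ij})^2. \]

By the definitions of $M_n^{1,\mathrm{ps}}$ and $M_n^{\tau,\mathrm{ps}}$ in (C.50), we have

\[ \max_{(i,j)\in S}\frac{(\tau_{1,ij} - \tau_{2,ij})^2}{\sigma^2_{1,\mathrm{ps}}/n_1 + \sigma^2_{2,\mathrm{ps}}/n_2} \le 2M_n^{1,\mathrm{ps}} + 2M_n^{\tau,\mathrm{ps}}. \tag{C.52} \]

Under $(U_1^{\tau}, U_2^{\tau}) \in \mathbb{V}(4)$, we have

\[ \max_{(i,j)\in S}\frac{(\tau_{1,ij} - \tau_{2,ij})^2}{\sigma^2_{1,\mathrm{ps}}/n_1 + \sigma^2_{2,\mathrm{ps}}/n_2} \ge 16\log d. \tag{C.53} \]

Combining Lemma C.13, (C.52) and (C.53), under $(U_1^{\tau}, U_2^{\tau}) \in \mathbb{V}(4)$, as $n, d \to \infty$, with probability going to one, we have

\[ M_n^{\tau,\mathrm{ps}} \ge \frac{1}{2}\max_{(i,j)\in S}\frac{(\tau_{1,ij} - \tau_{2,ij})^2}{\sigma^2_{1,\mathrm{ps}}/n_1 + \sigma^2_{2,\mathrm{ps}}/n_2} - M_n^{1,\mathrm{ps}} \ge 4\log d + \frac{1}{2}\log(\log d). \]

Therefore, under $(U_1^{\tau}, U_2^{\tau}) \in \mathbb{V}(4)$, as $n, d \to \infty$, we have

\[ 1 \ge P\big(M_n^{\tau,\mathrm{ps}} > G^-(\alpha) + 4\log d - \log(\log d)\big) \ge P\Big(M_n^{\tau,\mathrm{ps}} \ge 4\log d + \frac{1}{2}\log(\log d)\Big) \to 1, \]

where $G^-(\alpha) := -\log(8\pi) - 2\log\big(-\log(1-\alpha)\big)$. Hence, we prove (3.15). This completes the proof of Theorem 3.4. □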


C.6 Proof of Theorem 3.5 

Proof. It suffices to take $\mathcal{T}_\alpha$ to be the set of level-$\alpha$ tests over the normal distributions with covariance matrix $\Sigma$, where $\mathrm{Diag}(\Sigma) = I_d$, since it contains all the $\alpha$-level tests over the collection of the assumed distributions. For these normal distributions, we define

\[ \mathbb{H}(c') = \big\{(\Sigma_1,\Sigma_2) : \sigma_{1,ii} = 1,\ \sigma_{2,ii} = 1,\ \|\Sigma_1 - \Sigma_2\|_{\max} \ge c'\sqrt{\log d/n}\big\}. \]

By $\sigma_{ij} = \sin(\tau_{ij}\pi/2)$ in Theorem A.3, we have $|\sigma_{1,ij} - \sigma_{2,ij}| \le |\tau_{1,ij} - \tau_{2,ij}|\pi/2$. Therefore, for any $c'$, there is a $c_0$ such that $\mathbb{H}(c') \subset \mathbb{U}(c_0)$. For simplicity, we set $n_1 = n_2 = n$.

Let us consider the Gaussian setting and define

\[ \mathcal{F}(\rho) = \big\{\Sigma = I_d + \rho e_1 e_j^T + \rho e_j e_1^T : j = 2,\ldots,d\big\}, \]

where $e_k = (0,\ldots,0,1,0,\ldots,0)^T$ has $k-1$ leading zeros for $1 \le k \le d$, $\rho = c'\sqrt{\log d/n}$, and $\mu_\rho$ is the uniform measure on $\mathcal{F}(\rho)$. For simplicity, under the null hypothesis, we set $\Sigma_1 = \Sigma_2 = I_d$. Under the alternative hypothesis, we set $\Sigma_1 = \Sigma \sim \mu_\rho$ and $\Sigma_2 = I_d$. Therefore, under the alternative hypothesis, there is a $c_0$ such that $(\Sigma_1,\Sigma_2) \in \mathbb{H}(c') \subset \mathbb{U}(c_0)$.
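Let $P_\Sigma$ denote the probability measure of samples for $X \sim N(0,\Sigma_1)$ and $Y \sim N(0,\Sigma_2)$ with $\Sigma_1 = \Sigma$ and $\Sigma_2 = I_d$. We set $P_{\mu_\rho} = \int P_\Sigma\, d\mu_\rho(\Sigma)$. In particular, let $P_0$ denote the probability measure of samples for $X \sim N(0,\Sigma_1)$ and $Y \sim N(0,\Sigma_2)$ with $\Sigma_1 = \Sigma_2 = I_d$. We then have

\[ \inf_{T_\alpha\in\mathcal{T}_\alpha}\sup_{\Sigma\in\mathcal{F}(\rho)} P_\Sigma(T_\alpha = 0) \ge 1 - \alpha - \sup_{A : P_0(A)\le\alpha}|P_{\mu_\rho}(A) - P_0(A)| \ge 1 - \alpha - \frac{1}{2}\|P_{\mu_\rho} - P_0\|_{TV}, \]

where $\|\cdot\|_{TV}$ denotes the total variation norm. By setting

\[ L_{\mu_\rho}(z) := \frac{dP_{\mu_\rho}}{dP_0}(z), \]

and using Jensen's inequality, we have

\[ \|P_{\mu_\rho} - P_0\|_{TV} = \int |L_{\mu_\rho}(z) - 1|\, dP_0(z) = E_{P_0}|L_{\mu_\rho}(Z) - 1| \le \big(E_{P_0}L^2_{\mu_\rho}(Z) - 1\big)^{1/2}. \]

Therefore, as long as $E_{P_0}L^2_{\mu_\rho}(Z) = 1 + o(1)$, we have

\[ \inf_{T_\alpha\in\mathcal{T}_\alpha}\sup_{\Sigma\in\mathcal{F}(\rho)} P_\Sigma(T_\alpha = 0) \ge 1 - \alpha - o(1) > 0, \]

which is the desired result. We then aim to prove $E_{P_0}L^2_{\mu_\rho}(Z) = 1 + o(1)$. By construction, we have

\[ L_{\mu_\rho} = \frac{1}{d-1}\sum_{\Sigma\in\mathcal{F}(\rho)}\Big(\prod_{i=1}^{n}\frac{1}{|\Sigma|^{1/2}}\exp\Big(-\frac{1}{2}Z_i^T(\Omega - I_d)Z_i\Big)\Big), \]

where $\Omega := \Sigma^{-1}$ and $\{Z_i\}$ are independent random vectors with $Z_i = (Z_{i1},\ldots,Z_{id})^T \sim N(0, I_d)$. Therefore, we have

\[ E_{P_0}L^2_{\mu_\rho} = \frac{1}{(d-1)^2}\sum_{\Sigma_1,\Sigma_2\in\mathcal{F}(\rho)} E\Big(\prod_{i=1}^{n}\frac{1}{|\Sigma_1|^{1/2}|\Sigma_2|^{1/2}}\exp\Big(-\frac{1}{2}Z_i^T(\Omega_1 + \Omega_2 - 2I_d)Z_i\Big)\Big), \tag{C.54} \]

where $\Omega_a = \Sigma_a^{-1}$ for $a = 1, 2$. By setting

\[ A = \frac{\rho}{1-\rho^2}\begin{pmatrix} 2\rho & -1 & -1 \\ -1 & \rho & 0 \\ -1 & 0 & \rho \end{pmatrix}, \qquad B = \frac{2\rho}{1-\rho^2}\begin{pmatrix} \rho & -1 \\ -1 & \rho \end{pmatrix}, \]

$Z_{i,\{1,2\}} = (Z_{i1}, Z_{i2})^T$ and $Z_{i,\{1,2,3\}} = (Z_{i1}, Z_{i2}, Z_{i3})^T$, we have

\[ E_{P_0}L^2_{\mu_\rho} = \underbrace{\frac{d-2}{d-1}\prod_{i=1}^{n}\frac{1}{1-\rho^2}E\exp\Big(-\frac{1}{2}Z_{i,\{1,2,3\}}^T A Z_{i,\{1,2,3\}}\Big)}_{A_3} + \underbrace{\frac{1}{d-1}\prod_{i=1}^{n}\frac{1}{1-\rho^2}E\exp\Big(-\frac{1}{2}Z_{i,\{1,2\}}^T B Z_{i,\{1,2\}}\Big)}_{A_4}, \]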
where $A_3$ represents the sum of terms with $\Sigma_1 \ne \Sigma_2$ in (C.54) and $A_4$ represents the sum of terms with $\Sigma_1 = \Sigma_2$ in (C.54). For $A_3$, by the standard argument for calculating the moment generating function of a Gaussian quadratic form (Baldessari, 1967), we have

\[ A_3 = \frac{d-2}{d-1}\cdot(1-\rho^2)^{-n}\cdot\big((1 + \lambda_1(A))(1 + \lambda_2(A))(1 + \lambda_3(A))\big)^{-n/2}. \]

Moreover, we have $(1 + \lambda_1(A))(1 + \lambda_2(A))(1 + \lambda_3(A)) = |A + I_3| = (1-\rho^2)^{-2}$. Therefore, we have $A_3 = (d-2)/(d-1) = 1 + o(1)$. For $A_4$, it is easy to get $\lambda_1(B) = 2\rho/(1-\rho)$ and $\lambda_2(B) = -2\rho/(1+\rho)$. Accordingly, similarly to the calculation of $A_3$, we have $A_4 = (d-1)^{-1}\cdot(1-\rho^2)^{-n}$. Considering $\rho = c'\sqrt{\log d/n}$, as long as $c' < 1$, we have

\[ A_4 \le \frac{1}{d-1}\cdot(1 - c'^2\log d/n)^{-n} = (d-1)^{-1}\exp(c'^2\log d)(1 + o(1)) = o(1), \tag{C.55} \]

as $n, d \to \infty$. Combining $A_3 = 1 + o(1)$ and (C.55), we prove $E_{P_0}L^2_{\mu_\rho}(Z) = 1 + o(1)$. This completes the proof. □
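The determinant identities used for $A_3$ and $A_4$ are easy to confirm numerically. The sketch below is illustrative (function names and parameter values are assumptions chosen for the example): it builds $A$ and $B$ as above, checks $|A + I_3| = (1-\rho^2)^{-2}$ and $|B + I_2| = 1$, and evaluates $A_3$ and $A_4$ for given $n$, $d$ and $c'$.

```python
import math


def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def a_plus_identity(rho):
    """A + I_3 with A = rho/(1-rho^2) * [[2rho,-1,-1],[-1,rho,0],[-1,0,rho]]."""
    c = rho / (1.0 - rho ** 2)
    return [[1.0 + 2.0 * rho * c, -c, -c],
            [-c, 1.0 + rho * c, 0.0],
            [-c, 0.0, 1.0 + rho * c]]


def b_plus_identity(rho):
    """B + I_2 with B = 2rho/(1-rho^2) * [[rho,-1],[-1,rho]]."""
    c = 2.0 * rho / (1.0 - rho ** 2)
    return [[1.0 + rho * c, -c], [-c, 1.0 + rho * c]]


def second_moment_terms(n, d, c_prime):
    """Return (A_3, A_4) using E exp(-Z'MZ/2) = det(I + M)^(-1/2) for Z ~ N(0, I)."""
    rho = c_prime * math.sqrt(math.log(d) / n)
    per_sample_3 = det3(a_plus_identity(rho)) ** -0.5 / (1.0 - rho ** 2)
    per_sample_4 = det2(b_plus_identity(rho)) ** -0.5 / (1.0 - rho ** 2)
    a3 = (d - 2.0) / (d - 1.0) * per_sample_3 ** n
    a4 = 1.0 / (d - 1.0) * per_sample_4 ** n
    return a3, a4


if __name__ == "__main__":
    print(second_moment_terms(n=10 ** 4, d=10 ** 4, c_prime=0.5))
```

With $c' < 1$ the term $A_4$ behaves like $d^{c'^2 - 1}$ and vanishes, while $A_3$ stays at $(d-2)/(d-1)$, matching the conclusion $E_{P_0}L^2_{\mu_\rho}(Z) = 1 + o(1)$.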


C.7 Proof of Theorem 3.6 

We first introduce two additional lemmas. Given these lemmas, we prove Theorem 3.6. 
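Lemma C.14. Suppose that $(X_1, X_2, X_3, X_4)^T \sim N(0, \Sigma_{\mathrm{full}})$ is Gaussian distributed with

\[ \Sigma_{\mathrm{full}} = \begin{pmatrix} 1 & \varrho^T \\ \varrho & \Sigma \end{pmatrix} \quad\text{and}\quad \mathrm{Diag}(\Sigma) = I_3, \tag{C.56} \]

where $\varrho = (\rho_1, \rho_2, \rho_3)^T$. We have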

|:E(4.(Ai)-1/2)($(A 2 )-1/2)($(A 3 )-1/2)(4»(A4)-1/2)| < (^ + Vq^^q, (C.57) 

where $\Phi(\cdot)$ is the cumulative distribution function of a standard normal distribution.

The detailed proof of Lemma C.14 is in the supplementary Appendix D.14. 

Lemma C.15. Suppose that $(X_1, X_2, X_3, X_4)^T \sim N(0, \Sigma_{\mathrm{full}})$ is Gaussian distributed with

\[ \Sigma_{\mathrm{full}} = \begin{pmatrix} 1 & \rho_1 & a_1 & a_2 \\ \rho_1 & 1 & a_3 & a_4 \\ a_1 & a_3 & 1 & \rho_2 \\ a_2 & a_4 & \rho_2 & 1 \end{pmatrix}. \]

Then, when $|\rho_1|, |\rho_2|, |a_1|, \ldots, |a_4| \le r < 1$, we have

\[ C_r := \sup_{|\rho_1|,|\rho_2|,|a_1|,\ldots,|a_4|\le r}\big|\mathrm{Corr}\big\{(\Phi(X_1) - 1/2)(\Phi(X_2) - 1/2),\ (\Phi(X_3) - 1/2)(\Phi(X_4) - 1/2)\big\}\big| < 1. \]

Moreover, we have $C_r = 1$ only when $r = 1$ and the set $\{\rho_1, \rho_2, a_1, \ldots, a_4\}$ attains the boundary.

The detailed proof of Lemma C.15 is in the supplementary Appendix D.15.


Proof. To introduce the set $B_0$ and its properties, we set


C 0 = {(i.j) : i € f!(r)ljr} U««) : i 6 n MlJ r } and Bq — 5 0 (J^o, 

where H(r), T and Sq are defined in (3.17), Assumption (A4) and (C.8). For Kendall’s tau matrices, 
we set $S = \{(i,j) : 1 \le i < j \le d\}$ and $q = d$. By the definition of $B_0$, we have $|\tau_{a,k\ell}| \le r < 1$ for any $k \ne \ell \in \{i_1, j_1, i_2, j_2\}$, where $(i_1, j_1) \in S\setminus B_0$ and $(i_2, j_2) \in S\setminus B_0$. For any $(i,j) \in S\setminus B_0$, we also have $|\tau_{a,ij}| \le (\log d)^{-1-\alpha_0}$, as $S_0$ is a subset of $B_0$. By Assumptions (A1) and (A4), we have $|B_0| = o(d^2)$.

We then prove Theorem 3.6 in the same way as Theorems 3.2 and 3.3, except that we replace $S_0$ with $B_0$. However, as we do not require Assumption (A3) in Theorem 3.6, Lemma C.5 is not available. Therefore, we only need to prove (C.19) under Assumptions (A1), (A2) and (A4).

We now prove (C.19) under Assumptions (A1), (A2) and (A4). As we use $B_0$ to replace $S_0$ in the proofs of Theorems 3.2 and 3.3, we need to redefine some notation. After arranging the two-dimensional indices $\{(i,j) : (i,j) \in S\setminus B_0\}$ in any order, we write them as $\{(i_k, j_k) : 1 \le k \le h\}$ with $h = |S\setminus B_0|$. We denote $T_k = T_{i_k j_k}$, where the definition of $T_{ij}$ is the same as (C.5) except that we use Kendall's tau as the U-statistic in (C.5). We only need to check that (C.19) holds for $\{(i_k, j_k) : 1 \le k \le h\}$ under Assumptions (A1), (A2) and (A4). The definition of $N_\ell$ is the same as (C.16). By the definition of $N_\ell$ in (C.16), the entry in the $a$-th row and $b$-th column of $\mathrm{Var}(N_\ell)$ is

\[ \frac{n_2\mathrm{Cov}\big(h_{i_{k_a}j_{k_a}}(X_1), h_{i_{k_b}j_{k_b}}(X_1)\big)/n_1 + \mathrm{Cov}\big(h_{i_{k_a}j_{k_a}}(Y_1), h_{i_{k_b}j_{k_b}}(Y_1)\big)}{\sqrt{n_2\zeta_{1,i_{k_a}j_{k_a}}/n_1 + \zeta_{2,i_{k_a}j_{k_a}}}\ \sqrt{n_2\zeta_{1,i_{k_b}j_{k_b}}/n_1 + \zeta_{2,i_{k_b}j_{k_b}}}}, \tag{C.58} \]

where $h_{ij}$ is defined in (2.9). When $a = b$, (C.58) equals one.

The proof of (C.19) proceeds in three steps. In the first step, for any $(i,j), (k,\ell) \in S\setminus B_0$, we define the condition (★). This condition specifies that there exists an $i_1 \in \{i,j,k,\ell\}$ such that for any $j_1 \in \{i,j,k,\ell\}\setminus\{i_1\}$, we have $|\tau_{a,i_1 j_1}| = O\big((\log d)^{-1-\alpha_0}\big)$. In the second step, we prove that for $(i,j), (k,\ell) \in S\setminus B_0$ satisfying (★), as $n, d \to \infty$, we have

\[ \frac{n_2\mathrm{Cov}\big(h_{ij}(X_1), h_{k\ell}(X_1)\big)/n_1 + \mathrm{Cov}\big(h_{ij}(Y_1), h_{k\ell}(Y_1)\big)}{\sqrt{n_2\zeta_{1,ij}/n_1 + \zeta_{2,ij}}\ \sqrt{n_2\zeta_{1,k\ell}/n_1 + \zeta_{2,k\ell}}} = O\big((\log d)^{-1-\alpha_0}\big). \tag{C.59} \]

In the third step, we prove that for $(i,j), (k,\ell) \in S\setminus B_0$ not satisfying (★), we have

\[ \Bigg|\frac{n_2\mathrm{Cov}\big(h_{ij}(X_1), h_{k\ell}(X_1)\big)/n_1 + \mathrm{Cov}\big(h_{ij}(Y_1), h_{k\ell}(Y_1)\big)}{\sqrt{n_2\zeta_{1,ij}/n_1 + \zeta_{2,ij}}\ \sqrt{n_2\zeta_{1,k\ell}/n_1 + \zeta_{2,k\ell}}}\Bigg| \le C < 1, \tag{C.60} \]

as $n, d \to \infty$.

By the proof of Lemma 5 in Cai et al. (2013), (C.59) and (C.60) are sufficient conditions for obtaining (C.19). Therefore, we prove (C.19), which finishes the proof of Theorem 3.6.

Step (i). In this step, we define the condition (★). For this, we set the graph $G_{ijk\ell} = (V_{ijk\ell}, E_{ijk\ell})$, where $V_{ijk\ell} = \{i, j, k, \ell\}$ is the set of vertices and $E_{ijk\ell}$ is the set of edges. We say that there is an edge between $a \ne b \in \{i, j, k, \ell\}$ if and only if $|\tau_{ab}| > (\log d)^{-1-\alpha_0}$. If the number of distinct vertices in $V_{ijk\ell}$ is 3, we say that $G_{ijk\ell}$ is a three-vertex graph (3-G). If the number of distinct vertices in $V_{ijk\ell}$ is 4, we say that $G_{ijk\ell}$ is a four-vertex graph (4-G). If no edge is connected to a vertex in $G_{ijk\ell}$, we say that the vertex is isolated. For any $1 \le k_a \ne k_b \le h$, the graph $G_{i_{k_a}j_{k_a}i_{k_b}j_{k_b}}$ is either 3-G or 4-G. For any $1 \le k_a \ne k_b \le h$, we say $\mathcal{G} := G_{i_{k_a}j_{k_a}i_{k_b}j_{k_b}}$ satisfies (★) if the graph $\mathcal{G}$ has the property:

(★) If $G_{i_{k_a}j_{k_a}i_{k_b}j_{k_b}}$ is 4-G, there is at least one isolated vertex in $G_{i_{k_a}j_{k_a}i_{k_b}j_{k_b}}$; otherwise $G_{i_{k_a}j_{k_a}i_{k_b}j_{k_b}}$ is 3-G and $E_{i_{k_a}j_{k_a}i_{k_b}j_{k_b}} = \emptyset$.

Step (ii). In this step, we prove that for $(i,j), (k,\ell) \in S\setminus B_0$ satisfying the condition (★), (C.59) is correct. To prove (C.59), by $\zeta_{a,ij} > r_a > 0$ (see Assumption (A2)), it suffices to prove

\[ \mathrm{Cov}\big(h_{ij}(X_1), h_{k\ell}(X_1)\big) = O\big((\log d)^{-1-\alpha_0}\big) \quad\text{and}\quad \mathrm{Cov}\big(h_{ij}(Y_1), h_{k\ell}(Y_1)\big) = O\big((\log d)^{-1-\alpha_0}\big). \tag{C.61} \]

For simplicity, we only show the proof for $X$. We treat Kendall's tau as a special kind of U-statistic. By (2.9), we have $h_{ij}(X_1) = g_{ij}(X_1) - \tau_{1,ij}$. Therefore, we have

\[ \mathrm{Cov}\big(h_{ij}(X_1), h_{k\ell}(X_1)\big) = E[g_{ij}(X_1)g_{k\ell}(X_1)] - \tau_{1,ij}\tau_{1,k\ell}. \tag{C.62} \]

By $(i,j), (k,\ell) \in S\setminus B_0$, we have

\[ \tau_{1,ij} = O\big((\log d)^{-1-\alpha_0}\big) \quad\text{and}\quad \tau_{1,k\ell} = O\big((\log d)^{-1-\alpha_0}\big). \tag{C.63} \]

Combining (C.62) and (C.63), to get (C.61), it suffices to show

\[ E[g_{ij}(X_1)g_{k\ell}(X_1)] = O\big((\log d)^{-1-\alpha_0}\big). \tag{C.64} \]

To prove (C.64), by the definition of $g_{ij}$ in (C.37), we know that for Kendall's tau, we have

\[ g_{ij}(X_1) = P\big((X_{1i} - X_{2i})(X_{1j} - X_{2j}) > 0 \mid X_{1i}, X_{1j}\big) - P\big((X_{1i} - X_{2i})(X_{1j} - X_{2j}) < 0 \mid X_{1i}, X_{1j}\big). \tag{C.65} \]

Genz and Bretz (2009) show that for a two-dimensional normal vector $(N_1, N_2)^T \sim N(0, \Sigma)$ with

\[ \Sigma = \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}, \]

we have

\[ P(N_1 > a_1, N_2 > a_2) = \Phi(-a_1)\Phi(-a_2) + \frac{1}{2\pi}\int_0^{\sin^{-1}(\rho)}\exp\Big(-\frac{a_1^2 + a_2^2 - 2a_1a_2\sin\theta}{2\cos^2(\theta)}\Big)\,d\theta. \tag{C.66} \]
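The bivariate orthant formula (C.66) can be sanity-checked numerically in the cases where a closed form is available. The sketch below is an illustration with hypothetical function names: it evaluates the right-hand side with a midpoint rule and compares it with the classical orthant probability $1/4 + \sin^{-1}(\rho)/(2\pi)$ at $a_1 = a_2 = 0$, and with the product $\Phi(-a_1)\Phi(-a_2)$ at $\rho = 0$.

```python
import math


def std_normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def upper_orthant(a1, a2, rho, steps=20000):
    """P(N1 > a1, N2 > a2) for a standard bivariate normal with correlation rho,
    computed from the single-integral representation with a midpoint rule."""
    upper = math.asin(rho)
    width = upper / steps
    total = 0.0
    for s in range(steps):
        theta = (s + 0.5) * width
        exponent = (a1 * a1 + a2 * a2 - 2.0 * a1 * a2 * math.sin(theta)) \
            / (2.0 * math.cos(theta) ** 2)
        total += math.exp(-exponent)
    integral = total * width
    return std_normal_cdf(-a1) * std_normal_cdf(-a2) + integral / (2.0 * math.pi)


if __name__ == "__main__":
    print(upper_orthant(0.0, 0.0, 0.5), 0.25 + math.asin(0.5) / (2.0 * math.pi))
    print(upper_orthant(0.7, -0.2, 0.0),
          std_normal_cdf(-0.7) * std_normal_cdf(0.2))
```

The integral term vanishes when $\rho = 0$, so it is the part that carries the dependence between the two coordinates.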


Noticing that $X_1$ and $X_2$ follow the Gaussian copula distribution, by (C.65), (C.66) and Theorem A.3, we have

\[ g_{ij}(X_1) = (2F_i(X_{1i}) - 1)(2F_j(X_{1j}) - 1) + O(\tau_{1,ij}), \tag{C.67} \]

where $F_t$ is the cumulative distribution function of $X_{1t}$. Combining (C.63) and (C.67), to obtain (C.64), we only need to prove

\[ 2^4 E\big[(F_i(X_{1i}) - 1/2)(F_j(X_{1j}) - 1/2)(F_k(X_{1k}) - 1/2)(F_\ell(X_{1\ell}) - 1/2)\big] = O\big((\log d)^{-1-\alpha_0}\big). \tag{C.68} \]

As $X$ belongs to the Gaussian copula family, for simplicity, we assume that $(X_{1i}, X_{1j}, X_{1k}, X_{1\ell})^T \sim N(0, \Sigma_{\mathrm{full}})$ is Gaussian distributed with

\[ \mathrm{Diag}(\Sigma) = I_3 \quad\text{and}\quad \Sigma_{\mathrm{full}} = \begin{pmatrix} 1 & \varrho^T \\ \varrho & \Sigma \end{pmatrix}. \]

Accordingly, we have $F_t(\cdot) = \Phi(\cdot)$ for $t = 1, \ldots, d$, where $\Phi(\cdot)$ is the cumulative distribution function of a standard normal distribution. Hence, to obtain (C.68), it suffices to prove that as $n, d \to \infty$, we have

\[ E\big[(\Phi(X_{1i}) - 1/2)(\Phi(X_{1j}) - 1/2)(\Phi(X_{1k}) - 1/2)(\Phi(X_{1\ell}) - 1/2)\big] = O\big((\log d)^{-1-\alpha_0}\big). \tag{C.69} \]

If $G_{ijk\ell}$ is 4-G and satisfies (★), without loss of generality, we suppose that $X_{1i}$ is the isolated vertex. Hence, by Theorem A.3, we have $\|\varrho\|_2 = O\big((\log d)^{-1-\alpha_0}\big)$. Noticing that the smallest eigenvalue of any 4 by 4 principal sub-matrix of $\Sigma_a$ is bounded away from 0, by Lemma C.14, we have (C.69). Hence (C.59) is correct. If $G_{ijk\ell}$ is 3-G and satisfies (★), we have (C.69) similarly. This completes the proof of Step (ii).

Step (iii). For $(i,j), (k,\ell) \in S\setminus B_0$, to get (C.60) as $n, d \to \infty$, we only need to prove

\[ \big|\mathrm{Corr}\big(h_{ij}(X_1), h_{k\ell}(X_1)\big)\big| \le C < 1. \tag{C.70} \]

By (C.67) and (C.63), to prove (C.70), we only need to prove that as $n, d \to \infty$, we have

\[ \big|\mathrm{Corr}\big((F_i(X_{1i}) - 1/2)(F_j(X_{1j}) - 1/2),\ (F_k(X_{1k}) - 1/2)(F_\ell(X_{1\ell}) - 1/2)\big)\big| \le C < 1. \]

For simplicity, we suppose $(X_{1i}, X_{1j}, X_{1k}, X_{1\ell})^T \sim N(0, \Sigma_{\mathrm{full}})$ is Gaussian distributed with

\[ \mathrm{Diag}(\Sigma) = I_3 \quad\text{and}\quad \Sigma_{\mathrm{full}} = \begin{pmatrix} 1 & \varrho^T \\ \varrho & \Sigma \end{pmatrix}. \]

We then need to prove that as $n, d \to \infty$, we have

\[ \big|\mathrm{Corr}\big((\Phi(X_{1i}) - 1/2)(\Phi(X_{1j}) - 1/2),\ (\Phi(X_{1k}) - 1/2)(\Phi(X_{1\ell}) - 1/2)\big)\big| \le C < 1. \tag{C.71} \]

By Lemma C.15, we have (C.71). Therefore, we prove (C.60). By following the proof of Lemma 5 in Cai et al. (2013), we prove (C.19) under Assumptions (A1), (A2) and (A4). This completes the proof of Theorem 3.6. □


C.8 Proof of Theorem 2.6 and Remark 3.8 

Proof. The proof of Theorem 2.6 is similar to, and simpler than, the proof of Theorem 2.2. The proof of Remark 3.8 is similar to the proofs of Theorems 3.2, 3.3 and 3.6. Due to the close similarity, both proofs are omitted. □


D Proofs of Lemmas of Appendix C 

In this appendix, we present proofs of lemmas in Appendix C. In the sequel, we use C, C\, C 2 , ..., 
to denote constants that do not depend on n, d, q and can vary from place to place. 
D.1 Proof of Lemma C.1


Proof. By Lemma F.3, we have that $m^2\zeta_{a,ij}$ is the limiting variance of $\sqrt{n_a}\,\hat u_{a,ij}$ as $n, q \to \infty$. Combining Lemma F.3 and the definition of $h_{ij}$ in (2.9), we also have that $\zeta_{1,ij}$ and $\zeta_{2,ij}$ are the variances of $h_{ij}(X_\alpha)$ and $h_{ij}(Y_\alpha)$.

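In Lemma C.1, we aim to prove (C.1). For simplicity, we only provide the proof for $X$. By the definition of the Jackknife variance estimator $\hat\sigma^2(\hat u_{a,ij})$ in (2.5), we rewrite (C.1) as

\[ P\Big(\max_{1\le i,j\le q}\Big|\frac{m^2(n_1-1)}{(n_1-m)^2}\sum_{\alpha=1}^{n_1}(q_{1\alpha,ij} - \hat u_{1,ij})^2 - m^2\zeta_{1,ij}\Big| > C\frac{\varepsilon_n}{\log q}\Big) = o(1), \tag{D.1} \]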


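where $q_{1\alpha,ij}$ is defined as

\[ q_{1\alpha,ij} := \binom{n_1-1}{m-1}^{-1}\sum_{\substack{1\le \ell_1<\cdots<\ell_{m-1}\le n_1 \\ \ell_k\ne\alpha,\ k=1,\ldots,m-1}}\Phi_{ij}(X_\alpha, X_{\ell_1},\ldots,X_{\ell_{m-1}}). \]

To prove Lemma C.1, we also need the centralized version of $q_{1\alpha,ij}$. By setting $\tilde\Phi_{ij}(X_{\ell_1},\ldots,X_{\ell_m}) = \Phi_{ij}(X_{\ell_1},\ldots,X_{\ell_m}) - u_{1,ij}$, we define the centralized version of $q_{1\alpha,ij}$ as

\[ \tilde q_{1\alpha,ij} := \binom{n_1-1}{m-1}^{-1}\sum_{\substack{1\le \ell_1<\cdots<\ell_{m-1}\le n_1 \\ \ell_k\ne\alpha,\ k=1,\ldots,m-1}}\tilde\Phi_{ij}(X_\alpha, X_{\ell_1},\ldots,X_{\ell_{m-1}}). \tag{D.2} \]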


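To bound the left-hand side of (D.1), it is easy to obtain

\[ P\Big(\max_{1\le i,j\le q}\Big|\frac{m^2(n_1-1)}{(n_1-m)^2}\sum_{\alpha=1}^{n_1}(q_{1\alpha,ij} - \hat u_{1,ij})^2 - m^2\zeta_{1,ij}\Big| > C\frac{\varepsilon_n}{\log q}\Big) \le q^2\max_{1\le i,j\le q} P\Big(\Big|\frac{m^2(n_1-1)}{(n_1-m)^2}\sum_{\alpha=1}^{n_1}(q_{1\alpha,ij} - \hat u_{1,ij})^2 - m^2\zeta_{1,ij}\Big| > C\frac{\varepsilon_n}{\log q}\Big). \]

We then replace $q_{1\alpha,ij}$ and $\hat u_{1,ij}$ with their centralized versions $\tilde q_{1\alpha,ij}$ and $\tilde u_{1,ij}$ to obtain

\[ q^2\max_{1\le i,j\le q} P\Big(\Big|\frac{m^2(n_1-1)}{(n_1-m)^2}\sum_{\alpha=1}^{n_1}(q_{1\alpha,ij} - \hat u_{1,ij})^2 - m^2\zeta_{1,ij}\Big| > C\frac{\varepsilon_n}{\log q}\Big) = q^2\max_{1\le i,j\le q} P\Big(\Big|\frac{m^2(n_1-1)}{(n_1-m)^2}\sum_{\alpha=1}^{n_1}(\tilde q_{1\alpha,ij} - \tilde u_{1,ij})^2 - m^2\zeta_{1,ij}\Big| > C\frac{\varepsilon_n}{\log q}\Big). \]

Considering that $\zeta_{1,ij}$ is the variance of $h_{ij}(X_\alpha)$, by setting $\bar h_{1,ij} := \sum_{\alpha=1}^{n_1} h_{ij}(X_\alpha)/n_1$, we use $\sum_{\alpha=1}^{n_1}\big(h_{ij}(X_\alpha) - \bar h_{1,ij}\big)^2/n_1$ to approximate $\zeta_{1,ij}$. Therefore, we insert the term $\sum_{\alpha=1}^{n_1}\big(h_{ij}(X_\alpha) - \bar h_{1,ij}\big)^2/n_1$ and use the triangle inequality to obtain

\[ q^2\max_{1\le i,j\le q} P\Big(\Big|\frac{m^2(n_1-1)}{(n_1-m)^2}\sum_{\alpha=1}^{n_1}(\tilde q_{1\alpha,ij} - \tilde u_{1,ij})^2 - m^2\zeta_{1,ij}\Big| > C\frac{\varepsilon_n}{\log q}\Big) \]

\[ \le \underbrace{q^2\max_{1\le i,j\le q} P\Big(\Big|\frac{m^2(n_1-1)}{(n_1-m)^2}\Big(\sum_{\alpha=1}^{n_1}(\tilde q_{1\alpha,ij} - \tilde u_{1,ij})^2 - \sum_{\alpha=1}^{n_1}\big(h_{ij}(X_\alpha) - \bar h_{1,ij}\big)^2\Big)\Big| > \frac{1}{2}C\frac{\varepsilon_n}{\log q}\Big)}_{A_1} \]

\[ + \underbrace{q^2\max_{1\le i,j\le q} P\Big(\Big|\frac{m^2(n_1-1)}{(n_1-m)^2}\sum_{\alpha=1}^{n_1}\big(h_{ij}(X_\alpha) - \bar h_{1,ij}\big)^2 - m^2\zeta_{1,ij}\Big| > \frac{1}{2}C\frac{\varepsilon_n}{\log q}\Big)}_{A_2}. \]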

We then introduce an additional lemma to complete the proof. 
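Lemma D.1. Under Assumption (A2), as $n, q \to \infty$, we have

\[ q^2\max_{1\le i,j\le q} P\Big(\frac{m^2(n_1-1)}{(n_1-m)^2}\Big|\sum_{\alpha=1}^{n_1}(\tilde q_{1\alpha,ij}-\tilde u_{1,ij})^2 - \sum_{\alpha=1}^{n_1}(h_{ij}(X_\alpha)-\bar h_{1,ij})^2\Big| > C\frac{\varepsilon_n}{\log q}\Big) = o(1), \tag{D.3} \]

\[ q^2\max_{1\le i,j\le q} P\Big(\Big|\frac{m^2(n_1-1)}{(n_1-m)^2}\sum_{\alpha=1}^{n_1}(h_{ij}(X_\alpha)-\bar h_{1,ij})^2 - m^2\zeta_{1,ij}\Big| > C\frac{\varepsilon_n}{\log q}\Big) = o(1), \tag{D.4} \]

where $\varepsilon_n = o(1)$. Results for $Y$ are the same.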

The detailed proof of Lemma D.1 is in Appendix E.1. By Lemma D.1, we have $A_1 = o(1)$ and
$A_2 = o(1)$. Hence, we prove Lemma C.1. □
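To make the quantities in (D.1) concrete, the following minimal sketch computes them for Kendall's tau, whose kernel has order $m = 2$. It is an illustration under stated assumptions rather than the authors' implementation: the function names `sign` and `jackknife_scaled_variance` are hypothetical, ties are ignored, and the returned second value is the scaled sum $\frac{m^2(n_1-1)}{(n_1-m)^2}\sum_\alpha (q_{1\alpha,ij}-\hat u_{1,ij})^2$ that (D.1) compares with $m^2\zeta_{1,ij}$.

```python
def sign(v):
    """Return -1, 0 or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def jackknife_scaled_variance(x, y):
    """Kendall's tau (order m = 2) and the scaled jackknife sum from (D.1).

    q[a] averages the kernel sign(x_a - x_b) * sign(y_a - y_b) over all b != a,
    which is q_{1 alpha, ij} for m = 2; their mean is the U-statistic itself.
    """
    n, m = len(x), 2
    q = []
    for a in range(n):
        total = sum(sign(x[a] - x[b]) * sign(y[a] - y[b])
                    for b in range(n) if b != a)
        q.append(total / (n - 1))
    tau_hat = sum(q) / n
    scaled = m ** 2 * (n - 1) / (n - m) ** 2 * sum((qa - tau_hat) ** 2 for qa in q)
    return tau_hat, scaled


if __name__ == "__main__":
    # One discordant pair out of six, so tau_hat = 4/6.
    print(jackknife_scaled_variance([1, 2, 3, 4], [1, 3, 2, 4]))
```

For perfectly monotone data every $q_{1\alpha,ij}$ equals the statistic itself, so the scaled sum is zero; the spread of the $q_{1\alpha,ij}$ around $\hat u_{1,ij}$ is what carries the variance information.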

D.2 Proof of Lemma C.2 

Proof. In Lemma C.2, we aim to prove (C.7). For notational simplicity, we set

\[ W_{ij} = \frac{\binom{n_1}{m}^{-1}\Delta_{n_1,ij} - \binom{n_2}{m}^{-1}\Delta_{n_2,ij}}{\sqrt{m^2\zeta_{1,ij}/n_1 + m^2\zeta_{2,ij}/n_2}} \quad\text{and}\quad W_{a,ij} = \sqrt{n_a}\,\binom{n_a}{m}^{-1}\Delta_{n_a,ij}. \tag{D.5} \]

By (C.6), we have $W_{ij} = N_{ij} - T_{ij}$. We also have $W_{ij} = c_1 W_{1,ij} + c_2 W_{2,ij}$, where

\[ c_1 = 1\big/\sqrt{m^2\zeta_{1,ij} + m^2\zeta_{2,ij}\,n_1/n_2} \quad\text{and}\quad c_2 = -1\big/\sqrt{m^2\zeta_{1,ij}\,n_2/n_1 + m^2\zeta_{2,ij}}. \]


To prove (C.7), by setting $L_2 := |\max_{1\le i,j\le q}(N_{ij})^2 - \max_{1\le i,j\le q}(T_{ij})^2|$, it suffices to prove that as
$n, q \to \infty$ we have $L_2 = o_P(1)$. By the definitions of $N_{ij}$ and $W_{ij}$ in (C.4) and (D.5), we obtain

\[ L_2 \le \max_{1\le i,j\le q}|N_{ij} - T_{ij}|\,\max_{1\le i,j\le q}|N_{ij} + T_{ij}| \le \max_{1\le i,j\le q}|W_{ij}|\Big(\max_{1\le i,j\le q} 2|T_{ij}| + \max_{1\le i,j\le q}|W_{ij}|\Big). \]

Considering $\max_{1\le i,j\le q} 2|T_{ij}| + \max_{1\le i,j\le q}|W_{ij}| = O_P(\log q)$, to obtain $L_2 = o_P(1)$, we only need to prove

\[ P\Big(\max_{1\le i,j\le q}|W_{ij}| > 1/\log q\Big) \le q^2\max_{1\le i,j\le q} P\big(|W_{ij}| > 1/\log q\big) \to 0, \tag{D.6} \]

as $n, q \to \infty$. By Proposition 2.3(c) of Arcones and Giné (1993) and $W_{ij} = N_{ij} - T_{ij}$, we have

\[ P\Big(|W_{ij}| > \frac{1}{\log q}\Big) \le C\exp\Big(-\frac{C_1\,(n^{m/2-1/2}/\log q)^{2/m}}{\sigma^{2/m} + C_2\big((n^{m/2-1/2}/\log q)^{1/m}\,n^{-1/2}\big)^{2/(m+1)}}\Big). \tag{D.7} \]

Combining $\log q = O(n^{1/3-\epsilon})$ (see Assumption (A2)) and (D.7), we obtain (D.6). We then have
$L_2 = o_P(1)$. This completes the proof.

□ 


D.3 Proof of Lemma C.3 

Proof. In Lemma C.3, we aim to prove that as $n, q \to \infty$, we have $P(\max_{(i,j)\in S_0}(T_{ij})^2 > y_q) \to 0$,
where $T_{ij}$ is defined in (C.5) and $y_q = x + 4\log q - \log(\log q)$. For notational simplicity, we introduce

\[ g_{ij}(X_\alpha) := E[\Phi_{ij}(X_{\alpha_1},\ldots,X_{\alpha_m}) \mid X_\alpha] \quad\text{and}\quad g_{ij}(Y_\alpha) := E[\Phi_{ij}(Y_{\alpha_1},\ldots,Y_{\alpha_m}) \mid Y_\alpha], \]

where $\Phi_{ij}$ is the kernel function of the U-statistic $\hat u_{a,ij}$. By Lemma F.3, we have $\zeta_{1,ij} = \mathrm{Var}(g_{ij}(X_\alpha))$
and $\zeta_{2,ij} = \mathrm{Var}(g_{ij}(Y_\alpha))$. We then introduce an additional lemma.

Lemma D.2. If $g_{ij}$ is bounded, under Assumption (A2), we have

\[ P\Big(\max_{(i,j)\in A}\frac{\big(\frac{1}{n_1}\sum_{\alpha=1}^{n_1} g_{ij}(X_\alpha) - \frac{1}{n_2}\sum_{\alpha=1}^{n_2} g_{ij}(Y_\alpha) - u_{1,ij} + u_{2,ij}\big)^2}{\zeta_{1,ij}/n_1 + \zeta_{2,ij}/n_2} > t^2\Big) \le C|A|\,(1 - \Phi(t)), \]

uniformly for any $A \subseteq S$ and $t \in [0, O(n^{1/6-\epsilon})]$, where $\epsilon$ is an arbitrary positive number.

The detailed proof of Lemma D.2 is in the supplementary Appendix E.2. Considering $\log q = O(n^{1/3-\epsilon})$
(see Assumption (A2)), by choosing the arbitrary positive number in Lemma D.2 small enough (below half of
the $\epsilon$ in Assumption (A2)), we have $\sqrt{y_q}$ inside the range of $t$ allowed by Lemma D.2. By Lemma F.5, we have
$1 - \Phi(t) \le C\exp(-C_1 t^2)$, as $\Phi$ is the cumulative distribution function of a standard normal distribution.
Considering $|S_0| = o(q^2)$ (see Assumption (A1)) and $y_q = x + 4\log q - \log(\log q)$, by setting $A$ in
Lemma D.2 as $S_0$, we obtain

\[ P\Big(\max_{(i,j)\in S_0}(T_{ij})^2 > y_q\Big) \le o(q^2)\exp(-C_1 y_q) = o(1). \]

This completes the proof. □


D.4 Proof of Lemma C.4 

Proof. To prove Lemma C.4, we obtain both upper and lower bounds of $P(\max_{1\le k\le h}(T_k)^2 > y_q)$
using the Bonferroni inequality and normal approximation. By the Bonferroni inequality (Lemma 1
of Cai et al. (2013)), for any integer $M$ with $0 < M < [h/2]$, we have

\[ \sum_{\ell=1}^{2M}(-1)^{\ell-1}\sum_{1\le k_1<\cdots<k_\ell\le h} P\Big(\bigcap_{j=1}^{\ell} E_{k_j}\Big) \le P\Big(\max_{1\le k\le h}(T_k)^2 > y_q\Big) \le \sum_{\ell=1}^{2M-1}(-1)^{\ell-1}\sum_{1\le k_1<\cdots<k_\ell\le h} P\Big(\bigcap_{j=1}^{\ell} E_{k_j}\Big), \tag{D.8} \]

where we set $E_{k_j} = \{(T_{k_j})^2 > y_q\}$. By the definition of $T_k$ in (C.13), we have

\[ T_k = T_{i_k j_k} = \sum_{\beta=1}^{n_1+n_2} Z_{\beta,i_k j_k}\Big/\sqrt{n_2^2\,\zeta_{1,i_k j_k}/n_1 + n_2\,\zeta_{2,i_k j_k}}. \]

For any $\ell$-tuple $k_1,\ldots,k_\ell$ satisfying $1 \le k_1 < \cdots < k_\ell \le h$, we set

\[ Z_{\beta k} = Z_{\beta,i_k j_k}\big/\big(n_2\,\zeta_{1,i_k j_k}/n_1 + \zeta_{2,i_k j_k}\big)^{1/2} \quad\text{and}\quad W_\beta = (Z_{\beta k_1},\ldots,Z_{\beta k_\ell})^T, \]

for $1 \le k \le h$ and $1 \le \beta \le n_1+n_2$. Therefore, $T_k = n_2^{-1/2}\sum_{\beta=1}^{n_1+n_2} Z_{\beta k}$. Define $\|v\|_{\min} = \min_{1\le i\le \ell}|v_i|$
for any vector $v \in \mathbb{R}^\ell$. By $E_{k_j} = \{(T_{k_j})^2 > y_q\}$, we have

\[ P\Big(\bigcap_{j=1}^{\ell} E_{k_j}\Big) = P\Big(\Big\|n_2^{-1/2}\sum_{\beta=1}^{n_1+n_2} W_\beta\Big\|_{\min} > y_q^{1/2}\Big). \]

We set $N_\ell$ as a normal vector with the same mean vector and covariance matrix as $n_2^{-1/2}\sum_{\beta=1}^{n_1+n_2} W_\beta$.
More specifically, $N_\ell := (N_{k_1},\ldots,N_{k_\ell})^T$ is a normal vector with $E[N_\ell] = 0$ and $\mathrm{Var}(N_\ell) =
n_1\mathrm{Var}(W_1)/n_2 + \mathrm{Var}(W_{n_1+1})$. Therefore, $N_\ell$ is the normal approximation of $n_2^{-1/2}\sum_{\beta=1}^{n_1+n_2} W_\beta$.

We then aim to use $N_\ell$ to rewrite the lower and upper bounds obtained in (D.8). As $|Z_{\beta k}|$ is bounded,
we set $|Z_{\beta k}| \le K$, where $K$ is a constant. By Theorem 1 of Zaitsev (1987), we have

\[ P\Big(\Big\|n_2^{-1/2}\sum_{\beta=1}^{n_1+n_2} W_\beta\Big\|_{\min} > y_q^{1/2}\Big) \le P\Big(\|N_\ell\|_{\min} > y_q^{1/2} - \varepsilon_n(\log q)^{-1/2}\Big) + C_1\ell^{5/2}\exp\Big(-\frac{n^{1/2}\varepsilon_n}{C_2\ell^3 K(\log q)^{1/2}}\Big), \]

where $\varepsilon_n \to 0$, which will be specified later. Considering that $\ell$ is a fixed integer that does not
depend on $n$ and $q$, by $\log q = O(n^{1/3-\epsilon})$ (see Assumption (A2)), we can let $\varepsilon_n \to 0$ sufficiently
slowly such that we have

\[ C_1\ell^{5/2}\exp\Big(-\frac{n^{1/2}\varepsilon_n}{C_2\ell^3 K(\log q)^{1/2}}\Big) = O(q^{-J}) \]

for any large J > 0. Hence, we prove (C.17). The proof of (C.18) is similar. This completes the 
proof of Lemma C.4. □ 
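The two-sided bound in (D.8) rests on the fact that truncated inclusion-exclusion sums alternate around the probability of a union: stopping after an odd number of terms over-estimates it, stopping after an even number under-estimates it. The sketch below checks this on a small finite example. It is only an illustration; the helper name `bonferroni_partial_sums` and the uniform sample space are assumptions made for the example, not part of the paper.

```python
from itertools import combinations


def bonferroni_partial_sums(events, n_outcomes):
    """Return P(union of events) and the truncated inclusion-exclusion sums.

    `events` is a list of sets of outcomes drawn from a uniform sample space
    {0, ..., n_outcomes - 1}. The l-th entry of the returned list is
    sum_{r=1}^{l} (-1)^(r-1) * sum over r-subsets of P(intersection).
    """
    def prob(outcomes):
        return len(outcomes) / n_outcomes

    union_prob = prob(set().union(*events))
    partial, running = [], 0.0
    for depth in range(1, len(events) + 1):
        s_depth = sum(prob(set.intersection(*combo))
                      for combo in combinations(events, depth))
        running += (-1) ** (depth - 1) * s_depth
        partial.append(running)
    return union_prob, partial


if __name__ == "__main__":
    events = [{0, 1, 2}, {2, 3}, {0, 3, 4, 5}]
    print(bonferroni_partial_sums(events, 8))
```

In the proof, the intersections $\bigcap_j E_{k_j}$ play the role of the `set.intersection` terms, and the normal approximation is what makes each of their probabilities computable.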


D.5 Proof of Lemma C.5 

Proof. We aim to prove (C.19) under Assumption (A3). $g_{ij}$ is defined in (2.9). Assumption (A3)
requires

\[ u_{1,ijk\ell} = E[g_{ij}(X_\alpha)g_{k\ell}(X_\alpha)] = O((\log q)^{-1-\alpha_0}), \tag{D.9} \]
\[ u_{2,ijk\ell} = E[g_{ij}(Y_\alpha)g_{k\ell}(Y_\alpha)] = O((\log q)^{-1-\alpha_0}), \tag{D.10} \]

for $(i,j), (k,\ell) \in S\setminus S_0$, where $S = \{(i,j) : 1 \le i, j \le q\}$ and $S_0$ is defined in (C.8). By the definitions
of $S$ and $S_0$, for $(i,j) \in S\setminus S_0$, we have $|u_{a,ij}| \le O((\log q)^{-1-\alpha_0})$. $N_\ell$ is an $\ell$-dimensional normal
random vector with zero mean and covariance matrix $V_\ell$. Combining (2.9), (C.58), (D.9) and
(D.10), we obtain $\|V_\ell - I_\ell\|_2 = O((\log q)^{-1-\alpha_0})$ uniformly for all $k_1, k_2, \ldots, k_\ell$, where $k_1,\ldots,k_\ell$
is any $\ell$-tuple of positive integers satisfying $1 \le k_1 < \cdots < k_\ell \le h$. By setting $w_q = y_q^{1/2} \pm
\varepsilon_n(\log q)^{-1/2}$, we have

\[ P\Big(\|N_\ell\|_{\min} > y_q^{1/2} \pm \varepsilon_n(\log q)^{-1/2}\Big) = P\big(|N_{k_1}| > w_q, |N_{k_2}| > w_q, \ldots, |N_{k_\ell}| > w_q\big)
= \frac{1}{(2\pi)^{\ell/2}|V_\ell|^{1/2}}\int_{\|x\|_{\min} > w_q}\exp\Big(-\frac{1}{2}x^T V_\ell^{-1}x\Big)dx. \]

We then calculate the integral to obtain

\[ \frac{1}{(2\pi)^{\ell/2}|V_\ell|^{1/2}}\int_{\|x\|_{\min} > w_q}\exp\Big(-\frac{1}{2}x^T V_\ell^{-1}x\Big)dx
= \frac{1}{(2\pi)^{\ell/2}|V_\ell|^{1/2}}\int_{\|x\|_{\min} > w_q,\ \|x\|_{\max}\le(\log q)^{1/2+\alpha_0/4}}\exp\Big(-\frac{1}{2}x^T V_\ell^{-1}x\Big)dx + O\big(\exp(-(\log q)^{1+\alpha_0/2}/4)\big). \]

Considering that $\|V_\ell - I_\ell\|_2 = O((\log q)^{-1-\alpha_0})$ holds uniformly for all $(k_1, k_2, \ldots, k_\ell)$, we obtain

\[ \frac{1}{(2\pi)^{\ell/2}|V_\ell|^{1/2}}\int_{\|x\|_{\min} > w_q,\ \|x\|_{\max}\le(\log q)^{1/2+\alpha_0/4}}\exp\Big(-\frac{1}{2}x^T V_\ell^{-1}x\Big)dx + O\big(\exp(-(\log q)^{1+\alpha_0/2}/4)\big) \]
\[ = \frac{1 + O((\log q)^{-\alpha_0/2})}{(2\pi)^{\ell/2}}\int_{\|x\|_{\min} > w_q,\ \|x\|_{\max}\le(\log q)^{1/2+\alpha_0/4}}\exp\Big(-\frac{1}{2}x^T x\Big)dx + O\big(\exp(-(\log q)^{1+\alpha_0/2}/4)\big) \]
\[ = \frac{1 + O((\log q)^{-\alpha_0/2})}{(2\pi)^{\ell/2}}\int_{\|x\|_{\min} > w_q}\exp\Big(-\frac{1}{2}x^T x\Big)dx + O\big(\exp(-(\log q)^{1+\alpha_0/2}/4)\big). \]

We then calculate the integral to get

\[ \frac{1 + O((\log q)^{-\alpha_0/2})}{(2\pi)^{\ell/2}}\int_{\|x\|_{\min} > w_q}\exp\Big(-\frac{1}{2}x^T x\Big)dx + O\big(\exp(-(\log q)^{1+\alpha_0/2}/4)\big)
= (1 + o(1))\Big(\frac{1}{\sqrt{2\pi}}\exp\Big(-\frac{x}{2}\Big)\Big)^{\ell} q^{-2\ell}. \]

Therefore, as $n, q \to \infty$, we have

\[ P\Big(\|N_\ell\|_{\min} > y_q^{1/2} \pm \varepsilon_n(\log q)^{-1/2}\Big) = (1 + o(1))\Big(\frac{1}{\sqrt{2\pi}}\exp\Big(-\frac{x}{2}\Big)\Big)^{\ell} q^{-2\ell}. \tag{D.11} \]

Considering $C_h^\ell = h!/(\ell!(h-\ell)!)$ and $2h/q^2 = 1 + o(1)$, by (D.11), we prove (C.19). Hence, we
complete the proof. □


D.6 Proof of Lemma C.6 

Proof. By Theorem 2.2, under Assumptions (A1), (A2) and (A3), we have

\[ P\big(M_n - 4\log q + \log(\log q) \le x\big) \to \exp\Big(-\frac{1}{\sqrt{8\pi}}\exp\Big(-\frac{x}{2}\Big)\Big). \tag{D.12} \]

However, in Lemma C.6, we do not assume (A1) and (A3), so we cannot get (C.24) from
(D.12).

In Lemma C.6, we aim to prove (C.24) under Assumption (A2) alone. By (C.3) and (C.7), we have

\[ M_n - \max_{(i,j)\in S}\frac{\big(\frac{1}{n_1}\sum_{\alpha=1}^{n_1} g_{ij}(X_\alpha) - \frac{1}{n_2}\sum_{\alpha=1}^{n_2} g_{ij}(Y_\alpha) - u_{1,ij} + u_{2,ij}\big)^2}{\zeta_{1,ij}/n_1 + \zeta_{2,ij}/n_2} = o_P(1). \tag{D.13} \]

By Lemma D.2, we have

\[ P\Big(\max_{(i,j)\in S}\frac{\big(\frac{1}{n_1}\sum_{\alpha=1}^{n_1} g_{ij}(X_\alpha) - \frac{1}{n_2}\sum_{\alpha=1}^{n_2} g_{ij}(Y_\alpha) - u_{1,ij} + u_{2,ij}\big)^2}{\zeta_{1,ij}/n_1 + \zeta_{2,ij}/n_2} > t^2\Big) \le C_1|S|\,(1 - \Phi(t)), \tag{D.14} \]

where $\Phi(t)$ is the cumulative distribution function of a standard normal distribution. By Lemma
F.5, we have $1 - \Phi(t) \le \exp(-t^2/2)/(t\sqrt{2\pi})$ for $t > 0$. Considering $|S| = q^2$ and (D.14), by setting
$t^2 = 4\log q - 0.5\log(\log q)$, under Assumption (A2), we get

\[ P\Big(\max_{(i,j)\in S}\frac{\big(\frac{1}{n_1}\sum_{\alpha=1}^{n_1} g_{ij}(X_\alpha) - \frac{1}{n_2}\sum_{\alpha=1}^{n_2} g_{ij}(Y_\alpha) - u_{1,ij} + u_{2,ij}\big)^2}{\zeta_{1,ij}/n_1 + \zeta_{2,ij}/n_2} > 4\log q - \frac{1}{2}\log(\log q)\Big) = o(1). \tag{D.15} \]

Combining (D.13) and (D.15), we prove (C.24). This completes the proof.


□ 
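The limiting law in (D.12) can be inverted in closed form, which is how an extreme-value limit of this type is usually turned into a critical value. The sketch below does that inversion; it is a hedged illustration, not the authors' code. The function names are hypothetical, and the rule "reject when the maximum-type statistic exceeds $x_\alpha + 4\log q - \log(\log q)$" is an assumption about how the limit is used, stated here only to show the arithmetic.

```python
import math


def limit_cdf(x):
    """Limiting CDF appearing in (D.12): exp(-exp(-x/2) / sqrt(8*pi))."""
    return math.exp(-math.exp(-x / 2.0) / math.sqrt(8.0 * math.pi))


def critical_value(alpha):
    """Solve limit_cdf(x) = 1 - alpha for x (closed-form inversion)."""
    return -math.log(8.0 * math.pi) - 2.0 * math.log(math.log(1.0 / (1.0 - alpha)))


def rejection_threshold(alpha, q):
    """Assumed threshold for the statistic: x_alpha + 4 log q - log log q."""
    return critical_value(alpha) + 4.0 * math.log(q) - math.log(math.log(q))


if __name__ == "__main__":
    x_alpha = critical_value(0.05)
    print(x_alpha, limit_cdf(x_alpha), rejection_threshold(0.05, 200))
```

The inversion follows from setting $\exp(-\exp(-x/2)/\sqrt{8\pi}) = 1-\alpha$ and solving for $x$, which gives $x_\alpha = -\log(8\pi) - 2\log\log(1/(1-\alpha))$.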


D.7 Proof of Lemma C.7 

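Proof. By setting $S = \{(i,j) : 1 \le i,j \le q\}$, Lemma C.1 implies that both of the following two
events

\[ \mathcal{E}_1 := \Big\{\max_{(i,j)\in S}\big|n_1\hat\sigma^2(\hat u_{1,ij}) - m^2\zeta_{1,ij}\big| \le C\frac{\varepsilon_n}{\log q}\Big\}, \qquad \mathcal{E}_2 := \Big\{\max_{(i,j)\in S}\big|n_2\hat\sigma^2(\hat u_{2,ij}) - m^2\zeta_{2,ij}\big| \le C\frac{\varepsilon_n}{\log q}\Big\}, \]

happen with probability going to one as $n, q \to \infty$. Under $\mathcal{E}_1$ and $\mathcal{E}_2$, by $\zeta_{a,ij} \ge r_a > 0$ (see
Assumption (A2)), we have

\[ \big|n_1\hat\sigma^2(\hat u_{1,ij})/(m^2\zeta_{1,ij}) - 1\big| \le C\varepsilon_n/\log q \quad\text{and}\quad \big|n_2\hat\sigma^2(\hat u_{2,ij})/(m^2\zeta_{2,ij}) - 1\big| \le C\varepsilon_n/\log q. \]

Therefore, under $\mathcal{E}_1$ and $\mathcal{E}_2$, we have

\[ \Big|\max_{(i,j)\in S}\frac{(\hat u_{1,ij} - \hat u_{2,ij})^2}{\hat\sigma^2(\hat u_{1,ij}) + \hat\sigma^2(\hat u_{2,ij})} - \max_{(i,j)\in S}\frac{(\hat u_{1,ij} - \hat u_{2,ij})^2}{m^2\zeta_{1,ij}/n_1 + m^2\zeta_{2,ij}/n_2}\Big|
\le \max_{(i,j)\in S}\Big|\frac{(\hat u_{1,ij} - \hat u_{2,ij})^2}{\hat\sigma^2(\hat u_{1,ij}) + \hat\sigma^2(\hat u_{2,ij})} - \frac{(\hat u_{1,ij} - \hat u_{2,ij})^2}{m^2\zeta_{1,ij}/n_1 + m^2\zeta_{2,ij}/n_2}\Big|. \]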
Hence, as n, q —> 00 , we have 


< C 


max 


1 .ij ^2.1 j ) 


— max 


1 .ij 1^2,ij) 


log q 


— o p (l). 


a 2 (u ltij ) + a 2 {u 2 ,ij ) (iJ)eS m 2 Ci,ij/ni + m 2 C, 2 ,ij/n 2 
Considering (Uj^U^) E A(4), by (D.16), we obtain (C.25). This completes the proof. 


(D.16) 

□

D.8 Proof of Lemma C.8


Proof. Considering the definition of $M_n^{\tau,ps}$ in (C.8), by (3.8), we have $M_n^{\tau,ps} = \max_{(i,j)\in S}(N_{ij}^{\tau,ps})^2$.
By setting $S = \{(i,j) : 1 \le i < j \le d\}$, we aim to prove that as $n, d \to \infty$, we have

\[ \max_{(i,j)\in S}(N_{ij}^{\tau,ps})^2 - \max_{(i,j)\in S}(T_{ij}^{\tau})^2 = o_P(1), \tag{D.17} \]

where we define $T_{ij}^{\tau}$ in (C.42). By setting $L_2^{\tau,ps} := |\max_{(i,j)\in S}(N_{ij}^{\tau,ps})^2 - \max_{(i,j)\in S}(T_{ij}^{\tau})^2|$, we aim
to prove $L_2^{\tau,ps} = o_P(1)$. To prove $L_2^{\tau,ps} = o_P(1)$, we construct an upper bound of $L_2^{\tau,ps}$ as

\[ L_2^{\tau,ps} \le \max_{(i,j)\in S}|N_{ij}^{\tau,ps} - T_{ij}^{\tau}|\,\max_{(i,j)\in S}|N_{ij}^{\tau,ps} + T_{ij}^{\tau}| \le \max_{(i,j)\in S}|W_{ij}|\Big(\max_{(i,j)\in S} 2|T_{ij}^{\tau}| + \max_{(i,j)\in S}|W_{ij}|\Big). \]

Considering $\max_{(i,j)\in S} 2|T_{ij}^{\tau}| + \max_{(i,j)\in S}|W_{ij}| = O_P(\log d)$, to show $L_2^{\tau,ps} = o_P(1)$, we only need to
prove

\[ P\Big(\max_{(i,j)\in S}|W_{ij}| > 1/\log d\Big) \le d^2\max_{(i,j)\in S} P\big(|W_{ij}| > 1/\log d\big) \to 0, \tag{D.18} \]

as $n, d \to \infty$. By Proposition 2.3(c) of Arcones and Giné (1993), we have

\[ P\Big(|W_{ij}| > \frac{1}{\log d}\Big) \le C\exp\Big(-\frac{C_2\sqrt{n}/\log d}{\sigma + C_1(\log d\,\sqrt{n})^{-1/3}}\Big). \tag{D.19} \]

Considering $\log d = O(n^{1/3-\epsilon})$ (see Assumption (A2)), by (D.18) and (D.19), we obtain

\[ P\Big(\max_{(i,j)\in S}|W_{ij}| > 1/\log d\Big) \to 0, \]

as $n, d \to \infty$. Hence, we complete the proof. □

D.9 Proof of Lemma C.9 

Proof. $\sigma^2_{a,ps}$ and $g^{\tau}_{ij}$ are defined in (3.7) and (C.37). We aim to prove (C.44). For this, we introduce
the following lemma.

Lemma D.3. By setting $S = \{(i,j) : 1 \le i < j \le d\}$, under Assumption (A2), we have

\[ P\Big(\max_{(i,j)\in A}\frac{\big(\frac{1}{n_1}\sum_{\alpha=1}^{n_1} g^{\tau}_{ij}(X_\alpha) - \frac{1}{n_2}\sum_{\alpha=1}^{n_2} g^{\tau}_{ij}(Y_\alpha) - \tau_{1,ij} + \tau_{2,ij}\big)^2}{\sigma^2_{1,ps}/(4n_1) + \sigma^2_{2,ps}/(4n_2)} > t^2\Big) \le C|A|\,(1 - \Phi(C_1 t)), \]

uniformly for any $A \subseteq S$ and $t \in [0, O(n^{1/6-\epsilon})]$, where $\epsilon$ is an arbitrary positive number. $\Phi(t)$ is
the cumulative distribution function of a standard normal distribution.

The detailed proof of Lemma D.3 is in the supplementary Appendix E.3. By Lemma D.3, we
have

\[ P\Big(\max_{(i,j)\in S_0}(T^{\tau}_{ij})^2 > y_d\Big) \le C|S_0|\,\big(1 - \Phi(C_1\sqrt{y_d})\big), \tag{D.20} \]

where $y_d = x + 4\log d - \log(\log d)$. By Lemma F.5, we have

\[ 1 - \Phi(t) \le \exp(-t^2/2)/(t\sqrt{2\pi}) \quad\text{for } t > 0. \tag{D.21} \]

Considering $|S_0| = o(d^2)$ (see Assumption (A1)), by (D.20) and (D.21), we have
$P(\max_{(i,j)\in S_0}(T^{\tau}_{ij})^2 > y_d) = o(1)$. This completes the proof.

□

D.10 Proof of Lemma C.10


Proof. In Lemma C.10, we aim to prove $|4\zeta_{a,ij} - \sigma^2_{a,ps}| \le C|\tau_{a,ij}|$. For simplicity, we ignore the
subscript $a$ ($a = 1$ or $2$) for $X$ and $Y$. By (C.28), we have $4\zeta_{ij} = 16(\Pi_{cc,ij} - (\Pi_{c,ij})^2)$, where
$\Pi_{cc,ij}$ and $\Pi_{c,ij}$ are defined in (3.4) and (3.2). By the definition of the meta-elliptical distribution
in Definition A.2, we have

\[ (X_{\alpha i}, X_{\alpha j}) = \big(R_\alpha\cos(\Theta_\alpha),\ R_\alpha\sin(\Theta_\alpha + u)\big), \tag{D.22} \]

where $R_\alpha$ is a positive radial random variable and $\Theta_\alpha$ is an independent, uniformly distributed
angle on $[-\pi, \pi]$ with $\alpha = 1, 2, \ldots, m$ and $u = \tau_{ij}\pi/2$. By the definition of $\sigma^2_{ps}$ in (3.7), we
have $\sigma^2_{ps} = 16(\Pi^{u=0}_{cc,ij} - (\Pi^{u=0}_{c,ij})^2)$, where $\Pi^{u=0}_{cc,ij}$ and $\Pi^{u=0}_{c,ij}$ are $\Pi_{cc,ij}$ and $\Pi_{c,ij}$ under the condition
$u = \tau_{ij}\pi/2 = 0$.

By the definition of $\Pi_{cc,ij}$ in (3.4), $\Pi_{cc,ij}$ is the probability of four parts:

\[ E_1 = \{X_{2i} > X_{1i},\ X_{2j} > X_{1j},\ X_{3i} > X_{1i},\ X_{3j} > X_{1j}\}, \]
\[ E_2 = \{X_{2i} < X_{1i},\ X_{2j} < X_{1j},\ X_{3i} < X_{1i},\ X_{3j} < X_{1j}\}, \]
\[ E_3 = \{X_{2i} > X_{1i},\ X_{2j} > X_{1j},\ X_{3i} < X_{1i},\ X_{3j} < X_{1j}\}, \]
\[ E_4 = \{X_{2i} < X_{1i},\ X_{2j} < X_{1j},\ X_{3i} > X_{1i},\ X_{3j} > X_{1j}\}, \]

i.e., $\Pi_{cc,ij} = P(E_1) + P(E_2) + P(E_3) + P(E_4)$. By (D.22), we rewrite the event $E_1$ as

\[ E_1 = \{R_2\cos(\Theta_2) > X_{1i},\ R_2\sin(\Theta_2 + u) > X_{1j},\ R_3\cos(\Theta_3) > X_{1i},\ R_3\sin(\Theta_3 + u) > X_{1j}\}. \]

We then define $g_u(x_{1i}, x_{1j}, r_2, r_3)$ as

\[ g_u(x_{1i}, x_{1j}, r_2, r_3) := P[E_1 \mid X_{1i} = x_{1i}, X_{1j} = x_{1j}, R_2 = r_2, R_3 = r_3]. \tag{D.23} \]

By $P(E_1) = E\big[P[E_1 \mid X_{1i}, X_{1j}, R_2, R_3]\big]$, we have $P(E_1) = E[g_u(X_{1i}, X_{1j}, R_2, R_3)]$. By (D.22) and
(D.23), we have

\[ |g_u(x_{1i}, x_{1j}, r_2, r_3) - g_{u=0}(x_{1i}, x_{1j}, r_2, r_3)| \le C|\tau_{ij}|, \tag{D.24} \]

where $C$ is a positive constant. By (D.24), we obtain

\[ \big|E[g_u(X_{1i}, X_{1j}, R_2, R_3)] - E[g_{u=0}(X_{1i}, X_{1j}, R_2, R_3)]\big| \le C|\tau_{ij}|. \]

By similar proofs on $E_2$, $E_3$ and $E_4$, we have $|\Pi_{cc,ij} - \Pi^{u=0}_{cc,ij}| \le C|\tau_{ij}|$. Considering the boundedness
of $\Pi_{c,ij}$, we also have $|(\Pi_{c,ij})^2 - (\Pi^{u=0}_{c,ij})^2| \le C|\tau_{ij}|$. We then complete the proof by $\sigma^2_{ps} = 16(\Pi^{u=0}_{cc,ij} -
(\Pi^{u=0}_{c,ij})^2)$ and $4\zeta_{ij} = 16(\Pi_{cc,ij} - (\Pi_{c,ij})^2)$. □


D.ll Proof of Lemma C.ll 

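Proof. To prove Lemma C.11, we set

\[ M_n^{\tau,plug} := \max_{(i,j)\in S}\frac{(\hat\tau_{1,ij} - \hat\tau_{2,ij} - \tau_{1,ij} + \tau_{2,ij})^2}{\hat\sigma^2_{plug}(\hat\tau_{1,ij}) + \hat\sigma^2_{plug}(\hat\tau_{2,ij})} \quad\text{and}\quad \widetilde M_n^{\tau,plug} := \max_{(i,j)\in S}\frac{(\hat\tau_{1,ij} - \hat\tau_{2,ij} - \tau_{1,ij} + \tau_{2,ij})^2}{4\zeta_{1,ij}/n_1 + 4\zeta_{2,ij}/n_2}, \]

where $S = \{(i,j) : 1 \le i < j \le d\}$. By Theorem 3.2, under Assumptions (A1), (A2) and (A3),
we have

\[ P\big(M_n^{\tau,plug} - 4\log d + \log(\log d) \le x\big) \to \exp\Big(-\frac{1}{\sqrt{8\pi}}\exp\Big(-\frac{x}{2}\Big)\Big). \tag{D.25} \]

However, in Lemma C.11, we do not assume (A1) and (A3). Therefore, we cannot obtain (C.48)
from (D.25).

In Lemma C.11, we aim to prove (C.48) under Assumption (A2). By (C.26) and (C.7), as
$n, d \to \infty$, we obtain

\[ M_n^{\tau,plug} - \max_{(i,j)\in S}\frac{\big(\frac{1}{n_1}\sum_{\alpha=1}^{n_1} g^{\tau}_{ij}(X_\alpha) - \frac{1}{n_2}\sum_{\alpha=1}^{n_2} g^{\tau}_{ij}(Y_\alpha) - \tau_{1,ij} + \tau_{2,ij}\big)^2}{\zeta_{1,ij}/n_1 + \zeta_{2,ij}/n_2} = o_P(1), \tag{D.26} \]

where $g^{\tau}_{ij}$ is defined in (C.37). By Lemma D.2, we have

\[ P\Big(\max_{(i,j)\in S}\frac{\big(\frac{1}{n_1}\sum_{\alpha=1}^{n_1} g^{\tau}_{ij}(X_\alpha) - \frac{1}{n_2}\sum_{\alpha=1}^{n_2} g^{\tau}_{ij}(Y_\alpha) - \tau_{1,ij} + \tau_{2,ij}\big)^2}{\zeta_{1,ij}/n_1 + \zeta_{2,ij}/n_2} > t^2\Big) \le C_1|S|\,(1 - \Phi(t)), \tag{D.27} \]

where $\Phi(t)$ is the cumulative distribution function of a standard normal distribution. By Lemma
F.5, we have $1 - \Phi(t) \le \exp(-t^2/2)/(t\sqrt{2\pi})$ for $t > 0$. Considering $|S| = O(d^2)$ and (D.27), by
setting $t^2 = 4\log d - 0.5\log(\log d)$, under Assumption (A2), we obtain

\[ P\Big(\max_{(i,j)\in S}\frac{\big(\frac{1}{n_1}\sum_{\alpha=1}^{n_1} g^{\tau}_{ij}(X_\alpha) - \frac{1}{n_2}\sum_{\alpha=1}^{n_2} g^{\tau}_{ij}(Y_\alpha) - \tau_{1,ij} + \tau_{2,ij}\big)^2}{\zeta_{1,ij}/n_1 + \zeta_{2,ij}/n_2} > 4\log d - \frac{1}{2}\log(\log d)\Big) = o(1). \tag{D.28} \]

Combining (D.26) and (D.28), we prove (C.48). Hence, we complete the proof of Lemma C.11. □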
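The step from (D.27) to (D.28) combines a union bound over $|S| = O(d^2)$ pairs with the Gaussian tail bound of Lemma F.5 at $t^2 = 4\log d - 0.5\log(\log d)$. Plugging in gives $d^2\exp(-t^2/2)/(t\sqrt{2\pi}) = (\log d)^{1/4}/(t\sqrt{2\pi})$, which decays like $(\log d)^{-1/4}$. The sketch below evaluates both the exact tail and the Lemma F.5 bound numerically. It is an illustration only: the function names are hypothetical and the constant $C_1$ of (D.27) is taken to be one.

```python
import math


def gaussian_upper_tail(t):
    """1 - Phi(t) for the standard normal distribution."""
    return 0.5 * math.erfc(t / math.sqrt(2.0))


def mills_bound(t):
    """Upper bound exp(-t^2/2) / (t * sqrt(2*pi)) from Lemma F.5, for t > 0."""
    return math.exp(-t * t / 2.0) / (t * math.sqrt(2.0 * math.pi))


def union_bound(d):
    """Return d^2 * (1 - Phi(t)) and d^2 * mills_bound(t) at the level of (D.28)."""
    t = math.sqrt(4.0 * math.log(d) - 0.5 * math.log(math.log(d)))
    return d * d * gaussian_upper_tail(t), d * d * mills_bound(t)


if __name__ == "__main__":
    for d in (10, 1000, 10 ** 6):
        print(d, union_bound(d))
```

The decay is slow, which is consistent with the proof only claiming $o(1)$ rather than a polynomial rate.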

D.12 Proof of Lemma C.12 

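Proof. By setting $S = \{(i,j) : 1 \le i < j \le d\}$, (C.26) implies that both of the following two events

\[ \mathcal{E}_3 := \Big\{\max_{(i,j)\in S}\big|n_1\hat\sigma^2_{plug}(\hat\tau_{1,ij}) - m^2\zeta_{1,ij}\big| \le C\frac{\varepsilon_n}{\log q}\Big\}, \qquad \mathcal{E}_4 := \Big\{\max_{(i,j)\in S}\big|n_2\hat\sigma^2_{plug}(\hat\tau_{2,ij}) - m^2\zeta_{2,ij}\big| \le C\frac{\varepsilon_n}{\log q}\Big\}, \]

happen with probability going to one as $n, d \to \infty$. Under $\mathcal{E}_3$ and $\mathcal{E}_4$, by $\zeta_{a,ij} \ge r_a > 0$ (see
Assumption (A2)), we have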


max 




- 1 


< c 


log q 


and 


max 


m 2 Ca,ij 

Therefore, under £3 and £ 4 , as n, d —> 00 , we have 

( r l ,ij — T 2.,ij) 2 ( r l ,ij — T 2 ,ij) 2 


rn 2 (a, 




1 <i<j<d n a a 2 lug Za,ij ) 


-1 


< c 


log q 


max 


< max 
(iJ)eS 


(i,j)es ^i ug {n tij ) + $l lug (T 2 ,ij) (■ i,j)es m 2 Ci,ij/ni + m 2 Q 2iij /n 2 

1 ,ij T~2 ,ij) (t 1 ,ij T2,ij) 


^piug (n,v) + ? piug( f 2,ii) m 2 Ci,ii/ni + m 2 (2,ij/n 2 

Hence, as n, d —>■ 00 , we have 

( r i ,ij ~ T 2.,ij) 2 ( T Mi — r 2 ,ij) 2 


< C- 


max 


— max 


logd 


— Op ( 1 ). 


(i,j)£S a 2 lug (?i ,„■) + CT 2 lug (u 2 I ij) (ij)eS m 2 Ci,ij/ni + m 2 C 2 ,ij/n 2 
Considering (U^Up G U(4), by (D.29), we obtain (C.49) to complete the proof. 


(D.29) 

□ 



D.13 Proof of Lemma C.13 

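Proof. We aim to prove (C.51) under Assumptions (A1) and (A2). For Kendall's tau, we have

\[ T^{\tau}_{ij} = \frac{\frac{1}{n_1}\sum_{\alpha=1}^{n_1} g^{\tau}_{ij}(X_\alpha) - \frac{1}{n_2}\sum_{\alpha=1}^{n_2} g^{\tau}_{ij}(Y_\alpha) - \tau_{1,ij} + \tau_{2,ij}}{\sqrt{\sigma^2_{1,ps}/(4n_1) + \sigma^2_{2,ps}/(4n_2)}}, \tag{D.30} \]

\[ T_{ij} = \frac{\frac{1}{n_1}\sum_{\alpha=1}^{n_1} g^{\tau}_{ij}(X_\alpha) - \frac{1}{n_2}\sum_{\alpha=1}^{n_2} g^{\tau}_{ij}(Y_\alpha) - \tau_{1,ij} + \tau_{2,ij}}{\sqrt{\zeta_{1,ij}/n_1 + \zeta_{2,ij}/n_2}}. \tag{D.31} \]

Considering the definition of $M_n^{\tau,ps}$ in (C.50), by (C.43), we have

\[ M_n^{\tau,ps} - \max_{(i,j)\in S}\frac{\big(\frac{1}{n_1}\sum_{\alpha=1}^{n_1} g^{\tau}_{ij}(X_\alpha) - \frac{1}{n_2}\sum_{\alpha=1}^{n_2} g^{\tau}_{ij}(Y_\alpha) - \tau_{1,ij} + \tau_{2,ij}\big)^2}{\sigma^2_{1,ps}/(4n_1) + \sigma^2_{2,ps}/(4n_2)} = o_P(1), \]

where $S = \{(i,j) : 1 \le i < j \le d\}$. Therefore, to obtain (C.51), it suffices to prove that as $n, d \to \infty$,
we have $P(\max_{(i,j)\in S}(T^{\tau}_{ij})^2 > y_d) = o(1)$, where $y_d := 4\log d - \log(\log d)/2$.

We then aim to prove that as $n, d \to \infty$, we have

\[ \Big|P\Big(\max_{(i,j)\in S}(T^{\tau}_{ij})^2 > y_d\Big) - P\Big(\max_{(i,j)\in S\setminus S_0}(T^{\tau}_{ij})^2 > y_d\Big)\Big| = o(1), \tag{D.32} \]

where $S_0$ is defined in (C.8). To prove (D.32), by

\[ \Big|P\Big(\max_{(i,j)\in S}(T^{\tau}_{ij})^2 > y_d\Big) - P\Big(\max_{(i,j)\in S\setminus S_0}(T^{\tau}_{ij})^2 > y_d\Big)\Big| \le P\Big(\max_{(i,j)\in S_0}(T^{\tau}_{ij})^2 > y_d\Big), \]

we need to prove $P(\max_{(i,j)\in S_0}(T^{\tau}_{ij})^2 > y_d) = o(1)$ as $n, d \to \infty$. By Lemma D.3, we have

\[ P\Big(\max_{(i,j)\in S_0}(T^{\tau}_{ij})^2 > y_d\Big) \le C|S_0|\,\big(1 - \Phi(C_1\sqrt{y_d})\big), \tag{D.33} \]

where $\Phi(t)$ is the cumulative distribution function of a standard normal distribution. Lemma F.5
claims $1 - \Phi(t) \le \exp(-t^2/2)/(t\sqrt{2\pi})$ for $t > 0$. Considering $|S_0| = o(d^2)$ (see Assumption (A1)),
by (D.33), we have $P(\max_{(i,j)\in S_0}(T^{\tau}_{ij})^2 > y_d) = o(1)$. Therefore, we prove that as $n, d \to \infty$,
we have (D.32). To obtain (C.51), by (D.32), it suffices to prove that as $n, d \to \infty$, we have
$P(\max_{(i,j)\in S\setminus S_0}(T^{\tau}_{ij})^2 > y_d) = o(1)$.

Considering the definition of $S_0$ in (C.8), by Lemma C.10, for any $(i,j) \in S\setminus S_0$, we have
$|4\zeta_{a,ij} - \sigma^2_{a,ps}| \le C|\tau_{a,ij}| \le C(\log d)^{-1-\alpha_0}$. By $\zeta_{a,ij} \ge r_a > 0$ (see Assumption (A2)), for $(i,j) \in
S\setminus S_0$, we then have

\[ \big|\sigma^2_{a,ps}/(4\zeta_{a,ij}) - 1\big| \le C(\log d)^{-1-\alpha_0} \quad\text{and}\quad \big|4\zeta_{a,ij}/\sigma^2_{a,ps} - 1\big| \le C(\log d)^{-1-\alpha_0}. \tag{D.34} \]

Therefore, by (D.34), for $(i,j) \in S\setminus S_0$, we have

\[ \Big|\frac{(T^{\tau}_{ij})^2 - (T_{ij})^2}{(T_{ij})^2}\Big| \le \frac{|\sigma^2_{1,ps} - 4\zeta_{1,ij}|}{\sigma^2_{1,ps}} + \frac{|\sigma^2_{2,ps} - 4\zeta_{2,ij}|}{\sigma^2_{2,ps}} \le C(\log d)^{-1-\alpha_0}. \]

Therefore, for $(i,j) \in S\setminus S_0$, we have $|(T^{\tau}_{ij})^2 - (T_{ij})^2| \le C(T_{ij})^2(\log d)^{-1-\alpha_0}$. We then have

\[ \Big|\max_{(i,j)\in S\setminus S_0}(T_{ij})^2 - \max_{(i,j)\in S\setminus S_0}(T^{\tau}_{ij})^2\Big| \le \max_{(i,j)\in S\setminus S_0}\big|(T^{\tau}_{ij})^2 - (T_{ij})^2\big| \le C(\log d)^{-1-\alpha_0}\max_{(i,j)\in S\setminus S_0}(T_{ij})^2. \tag{D.35} \]

Considering $\max_{(i,j)\in S\setminus S_0}(T_{ij})^2 = O_P(\log d)$, by (D.35), we have

\[ \Big|\max_{(i,j)\in S\setminus S_0}(T_{ij})^2 - \max_{(i,j)\in S\setminus S_0}(T^{\tau}_{ij})^2\Big| = o_P(1). \]

Therefore, it suffices to prove

\[ P\Big(\max_{(i,j)\in S\setminus S_0}(T_{ij})^2 > y_d\Big) = o(1), \tag{D.36} \]

as $n, d \to \infty$. By Lemma D.2, we have

\[ P\Big(\max_{(i,j)\in S\setminus S_0}\frac{\big(\frac{1}{n_1}\sum_{\alpha=1}^{n_1} g^{\tau}_{ij}(X_\alpha) - \frac{1}{n_2}\sum_{\alpha=1}^{n_2} g^{\tau}_{ij}(Y_\alpha) - \tau_{1,ij} + \tau_{2,ij}\big)^2}{\zeta_{1,ij}/n_1 + \zeta_{2,ij}/n_2} > t^2\Big) \le C_1|S\setminus S_0|\,(1 - \Phi(t)). \tag{D.37} \]

By Lemma F.5, we have $1 - \Phi(t) \le \exp(-t^2/2)/(t\sqrt{2\pi})$ for $t > 0$. Considering $|S\setminus S_0| = O(d^2)$
(see Assumption (A1)) and (D.37), by setting $t^2 = 4\log d - 0.5\log(\log d)$, as $n, d \to \infty$, we obtain

\[ P\Big(\max_{(i,j)\in S\setminus S_0}\frac{\big(\frac{1}{n_1}\sum_{\alpha=1}^{n_1} g^{\tau}_{ij}(X_\alpha) - \frac{1}{n_2}\sum_{\alpha=1}^{n_2} g^{\tau}_{ij}(Y_\alpha) - \tau_{1,ij} + \tau_{2,ij}\big)^2}{\zeta_{1,ij}/n_1 + \zeta_{2,ij}/n_2} > y_d\Big) = o(1). \]

Therefore, by (D.31), we prove (D.36). Hence, we complete the proof.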


□ 


D.14 Proof of Lemma C.14 

Proof. We write $X_1 = Z_0 + Z_1$ with $Z_1 = \varrho^T\Sigma^{-1}(X_2, X_3, X_4)^T$ and $Z_0 = X_1 - Z_1$. We then obtain
that $Z_0$ is independent of $(X_2, X_3, X_4)$, $Z_1 \sim N(0, \varrho^T\Sigma^{-1}\varrho)$ and $Z_0 \sim N(0, 1 - \varrho^T\Sigma^{-1}\varrho)$. We set
$\sigma^2 := \varrho^T\Sigma^{-1}\varrho$. Accordingly, we write

\[ E\big[(\Phi(X_1) - 1/2)(\Phi(X_2) - 1/2)(\Phi(X_3) - 1/2)(\Phi(X_4) - 1/2)\big]
= E\Big[E[\Phi(X_1) - 1/2 \mid Z_1]\cdot E\big[(\Phi(X_2) - 1/2)(\Phi(X_3) - 1/2)(\Phi(X_4) - 1/2) \mid Z_1\big]\Big]. \tag{D.38} \]

We first focus on bounding $|E[\Phi(X_1) - 1/2 \mid Z_1]|$. We use $I_3$ to denote it. Given $Z_1$, we have
$Z_0 + Z_1 \mid Z_1 \sim N(Z_1, 1 - \sigma^2)$. Hence, we have $I_3 = |E[\Phi(\xi) - E\Phi(\eta) \mid Z_1]|$ with $\xi \sim N(Z_1, 1 - \sigma^2)$ and
$\eta \sim N(0, 1)$. Because the equation holds for any joint distribution of $N(Z_1, 1 - \sigma^2)$ and $N(0, 1)$, we
have $I_3 = |E[\Phi(Z_1 + \sqrt{1 - \sigma^2}\,Y_0) - E\Phi(Y_0) \mid Z_1]|$, where $Y_0 \sim N(0, 1)$ is independent of $Z_1$. Then
we have

\[ I_3 \le E\big[|\Phi(Z_1 + \sqrt{1 - \sigma^2}\,Y_0) - \Phi(Y_0)| \,\big|\, Z_1\big] \le \frac{1}{\sqrt{2\pi}}E\big[|Z_1 + \sqrt{1 - \sigma^2}\,Y_0 - Y_0| \,\big|\, Z_1\big]. \]

Given $Z_1$, because $Z_1 + \sqrt{1 - \sigma^2}\,Y_0 - Y_0 \sim N(Z_1, (1 - \sqrt{1 - \sigma^2})^2)$, denoting $\sigma_2^2 = (1 - \sqrt{1 - \sigma^2})^2$,
we then have

\[ \Big|E\Big[\Phi(X_1) - \frac{1}{2} \,\Big|\, Z_1\Big]\Big| \le \frac{1}{\sqrt{2\pi}}\Big[\sigma_2\sqrt{2/\pi}\,\exp\big(-Z_1^2/(2\sigma_2^2)\big) + Z_1\big(1 - 2\Phi(-Z_1/\sigma_2)\big)\Big]. \tag{D.39} \]

Moreover, it is easy to obtain $|\Phi(\cdot) - 1/2| \le 1/2$. Therefore, we have

\[ \big|E[(\Phi(X_2) - 1/2)(\Phi(X_3) - 1/2)(\Phi(X_4) - 1/2) \mid Z_1]\big| \le 1/8. \tag{D.40} \]

Combining (D.38), (D.39) and (D.40), we then have

\[ \big|E[(\Phi(X_1) - 1/2)(\Phi(X_2) - 1/2)(\Phi(X_3) - 1/2)(\Phi(X_4) - 1/2)]\big| \le \frac{\sigma_2}{8\pi} + \frac{1}{8\sqrt{2\pi}}E\big[Z_1\big(1 - 2\Phi(-Z_1/\sigma_2)\big)\big]. \tag{D.41} \]

Considering $E[Z_1^2] = \sigma^2$ and $E[\Phi^2(Z_1/\sigma_2)] \le 1$, we have

\[ E\big[Z_1\big(1 - 2\Phi(-Z_1/\sigma_2)\big)\big] = 2E[Z_1\Phi(Z_1/\sigma_2)] \le 2E[Z_1^2]^{1/2}E[\Phi^2(Z_1/\sigma_2)]^{1/2} \le 2\sigma. \tag{D.42} \]

Noticing $\sigma_2 \le \sigma$, combining (D.41) and (D.42), we obtain

\[ \big|E[(\Phi(X_1) - 1/2)(\Phi(X_2) - 1/2)(\Phi(X_3) - 1/2)(\Phi(X_4) - 1/2)]\big| \le \Big(\frac{1}{8\pi} + \frac{1}{4\sqrt{2\pi}}\Big)\sigma. \]

This completes the proof. 


□ 
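The bracket in (D.39) is the closed-form mean of a folded normal variable: for $W \sim N(\mu, s^2)$, $E|W| = s\sqrt{2/\pi}\exp(-\mu^2/(2s^2)) + \mu(1 - 2\Phi(-\mu/s))$, applied with $\mu = Z_1$ and $s = \sigma_2$. The sketch below checks this identity against a plain trapezoidal integration. It is a numerical sanity check under stated assumptions, not part of the proof; the function names, the integration window of twelve standard deviations and the step count are choices made for the example.

```python
import math


def std_normal_cdf(z):
    """Standard normal CDF, written with the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def folded_normal_mean(mu, sigma):
    """Closed form of E|W| for W ~ N(mu, sigma^2), the bracket of (D.39)."""
    return (sigma * math.sqrt(2.0 / math.pi) * math.exp(-mu * mu / (2.0 * sigma * sigma))
            + mu * (1.0 - 2.0 * std_normal_cdf(-mu / sigma)))


def folded_normal_mean_numeric(mu, sigma, steps=200000):
    """Trapezoidal approximation of the integral of |w| * density(w)."""
    lo, hi = mu - 12.0 * sigma, mu + 12.0 * sigma
    h = (hi - lo) / steps
    norm = sigma * math.sqrt(2.0 * math.pi)
    total = 0.0
    for k in range(steps + 1):
        w = lo + k * h
        dens = math.exp(-((w - mu) ** 2) / (2.0 * sigma * sigma)) / norm
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * abs(w) * dens
    return total * h


if __name__ == "__main__":
    for mu, sigma in ((0.3, 0.5), (-1.0, 0.2), (0.0, 1.0)):
        print(mu, sigma, folded_normal_mean(mu, sigma), folded_normal_mean_numeric(mu, sigma))
```

The case $\mu = 0$ recovers the familiar value $s\sqrt{2/\pi}$, which is the source of the $\sigma_2/(8\pi)$ term in (D.41) after multiplying by $1/\sqrt{2\pi}$ and $1/8$.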


D.15 Proof of Lemma C.15 

Proof. First, we prove that only when $r = 1$ and the set $\{p_1, p_2, a_1, \ldots, a_4\}$ attains the boundary,
we have $C_r = 1$. When $C_r = 1$, we have

\[ (\Phi(X_1) - 1/2)(\Phi(X_2) - 1/2) = a\cdot(\Phi(X_3) - 1/2)(\Phi(X_4) - 1/2), \]

for some constant $a$. This implies that

\[ X_1 = \Phi^{-1}\Big(\frac{a\cdot(\Phi(X_3) - 1/2)(\Phi(X_4) - 1/2)}{\Phi(X_2) - 1/2} + \frac{1}{2}\Big). \]

We have $X_1 \sim N(0, 1)$ if and only if we have

\[ \frac{a\cdot(\Phi(X_3) - 1/2)(\Phi(X_4) - 1/2)}{\Phi(X_2) - 1/2} \sim \mathrm{Unif}(-1/2, 1/2). \]

When $X_2 \ne \pm X_3$ and $X_2 \ne \pm X_4$, there is always a possibility that $X_2$ is very close to
zero while both $X_3$ and $X_4$ are away from zero. Under this condition, $(a\cdot(\Phi(X_3) - 1/2)(\Phi(X_4) -
1/2))/(\Phi(X_2) - 1/2)$ is very large and outside $[-1/2, 1/2]$. Therefore, $X_2$ must equal $\pm X_3$ or $\pm X_4$.
Equivalently, $\{p_1, p_2, a_1, \ldots, a_4\}$ attains the boundary. This completes the proof of the first part.

Secondly, it is obvious that there is a one-to-one map between $r$ and $C_r$. Accordingly, as long
as $r < 1$, $C_r < 1$ only depends on $r$. □


E Proofs of Lemmas in Appendix D 

In Appendix E, we present proofs of the three lemmas introduced in Appendix D.

E.1 Proof of Lemma D.1


Proof. To prove (D.3) and (D.4), we need to analyze the following two terms:

\[ A_1 := P\Big(\frac{m^2(n_1-1)}{(n_1-m)^2}\Big|\sum_{\alpha=1}^{n_1}(\tilde q_{1\alpha,ij} - \tilde u_{1,ij})^2 - \sum_{\alpha=1}^{n_1}(h_{ij}(X_\alpha) - \bar h_{1,ij})^2\Big| > t/2\Big), \tag{E.1} \]

\[ A_2 := P\Big(\Big|\frac{m^2(n_1-1)}{(n_1-m)^2}\sum_{\alpha=1}^{n_1}(h_{ij}(X_\alpha) - \bar h_{1,ij})^2 - m^2\zeta_{1,ij}\Big| > t/2\Big). \tag{E.2} \]

We bound $A_1$ and $A_2$ separately. $A_2$ represents the approximation error of $\sum_{\alpha=1}^{n_1}(h_{ij}(X_\alpha) - \bar h_{1,ij})^2/n_1$
for $\zeta_{1,ij}$. Considering the definition of $\zeta_{a,ij}$ in Lemma F.3, by setting $\bar h_{1,ij} = \sum_{\alpha=1}^{n_1} h_{ij}(X_\alpha)/n_1$, we
have

\[ \sum_{\alpha=1}^{n_1}(h_{ij}(X_\alpha) - \bar h_{1,ij})^2 = \sum_{\alpha=1}^{n_1} h_{ij}^2(X_\alpha) - n_1(\bar h_{1,ij})^2 \quad\text{and}\quad \zeta_{1,ij} = E[h_{ij}^2(X_\alpha)]. \tag{E.3} \]

Thus, by using the triangle inequality, we obtain

\[ P\Big(\Big|\frac{m^2(n_1-1)}{(n_1-m)^2}\sum_{\alpha=1}^{n_1}(h_{ij}(X_\alpha) - \bar h_{1,ij})^2 - m^2\zeta_{1,ij}\Big| > t/2\Big)
\le P\Big(\Big|\frac{1}{n_1}\sum_{\alpha=1}^{n_1} h_{ij}^2(X_\alpha) - E[h_{ij}^2(X_\alpha)]\Big| > C_1 t\Big) + P\big((\bar h_{1,ij})^2 > C_2 t\big). \]

By Lemma F.1, we use Hoeffding's inequality to obtain

\[ P\Big(\Big|\sum_{\alpha=1}^{n_1} h_{ij}^2(X_\alpha)/n_1 - E[h_{ij}^2(X_\alpha)]\Big| > C_1 t\Big) \le C_3\exp(-C_4 n_1 t^2). \tag{E.4} \]

Considering that $h_{ij}(X_\alpha)$ is bounded, by $E[h_{ij}(X_\alpha)] = 0$, we use Hoeffding's inequality (Lemma F.1) to
obtain

\[ P\big((\bar h_{1,ij})^2 > C_2 t\big) = P\big(|\bar h_{1,ij}| > \sqrt{C_2 t}\big) \le C_5\exp(-C_6 n_1 t). \tag{E.5} \]

Hence, combining (E.4) and (E.5), we have

\[ A_2 \le C_3\exp(-C_4 n_1 t^2) + C_5\exp(-C_6 n_1 t). \tag{E.6} \]

By (E.6) and $\log q = O(n^{1/3-\epsilon})$ (Assumption (A2)), we have (D.4) by setting $\varepsilon_n = 1/(\log q)^{\kappa_0}$,
where $\kappa_0 > 0$ is sufficiently small.

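For $A_1$, we use (E.3) to rewrite $A_1$ as

\[ P\Big(\frac{m^2(n_1-1)}{(n_1-m)^2}\Big|\Big(\sum_{\alpha=1}^{n_1}(\tilde q_{1\alpha,ij})^2 - n_1(\tilde u_{1,ij})^2\Big) - \Big(\sum_{\alpha=1}^{n_1}(h_{ij}(X_\alpha))^2 - n_1(\bar h_{1,ij})^2\Big)\Big| > t/2\Big). \tag{E.7} \]

Similarly to $A_2$, we rearrange terms in $A_1$ and use the triangle inequality to obtain

\[ P\Big(\frac{m^2(n_1-1)}{(n_1-m)^2}\Big|\Big(\sum_{\alpha=1}^{n_1}(\tilde q_{1\alpha,ij})^2 - n_1(\tilde u_{1,ij})^2\Big) - \Big(\sum_{\alpha=1}^{n_1}(h_{ij}(X_\alpha))^2 - n_1(\bar h_{1,ij})^2\Big)\Big| > t/2\Big) \]
\[ \le \underbrace{P\Big(\frac{m^2(n_1-1)}{(n_1-m)^2}\Big|\sum_{\alpha=1}^{n_1}(\tilde q_{1\alpha,ij})^2 - \sum_{\alpha=1}^{n_1}(h_{ij}(X_\alpha))^2\Big| > t/4\Big)}_{I_1} + \underbrace{P\Big(\frac{m^2 n_1(n_1-1)}{(n_1-m)^2}\big|(\tilde u_{1,ij})^2 - (\bar h_{1,ij})^2\big| > t/4\Big)}_{I_2}. \tag{E.8} \]

Therefore, to bound $A_1$, we need to bound $I_1$ and $I_2$ separately. By

\[ (\tilde u_{1,ij})^2 - (\bar h_{1,ij})^2 = (\tilde u_{1,ij} + \bar h_{1,ij})(\tilde u_{1,ij} - \bar h_{1,ij}) \quad\text{and}\quad |\tilde u_{1,ij} + \bar h_{1,ij}| \le C, \]

we use the triangle inequality to bound $I_2$ and get

\[ I_2 \le P\big(|\tilde u_{1,ij} - \bar h_{1,ij}| > Ct\big) \le P\big(|\tilde u_{1,ij}| > Ct/2\big) + P\big(|\bar h_{1,ij}| > Ct/2\big). \]

We know that the kernel functions of $\tilde u_{1,ij}$ and $\bar h_{1,ij}$ are bounded. Noticing $E[\tilde u_{1,ij}] = E[\bar h_{1,ij}] = 0$, we use
the exponential inequalities (Lemmas F.1 and F.2) to obtain $I_2 \le C_1\exp(-C_2 n_1 t^2)$. Considering
$\log q = o(n^{1/3-\epsilon})$ in Assumption (A2), by setting $\varepsilon_n = 1/(\log q)^{\kappa_0}$ with $\kappa_0 > 0$ sufficiently small,
we have

\[ q^2\max_{1\le i,j\le q} P\Big(\frac{m^2 n_1(n_1-1)}{(n_1-m)^2}\big|(\tilde u_{1,ij})^2 - (\bar h_{1,ij})^2\big| > C\frac{\varepsilon_n}{\log q}\Big) = o(1). \tag{E.9} \]

To prove (D.3), by (E.8) and (E.9), we only need to prove

\[ q^2\max_{1\le i,j\le q} P\Big(\frac{m^2(n_1-1)}{(n_1-m)^2}\Big|\sum_{\alpha=1}^{n_1}(\tilde q_{1\alpha,ij})^2 - \sum_{\alpha=1}^{n_1}(h_{ij}(X_\alpha))^2\Big| > C\frac{\varepsilon_n}{\log q}\Big) = o(1). \tag{E.10} \]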


For this, we need to bound $I_1$. To bound $I_1$, we introduce the following decomposition of $\tilde q_{1\alpha,ij}$.
We set



E 

^■(Xa,^,.. 

* ’ ^-^m— 1 ) 









with A = ( 

'm-l\ /ni-2\ R 

.mi—1/ \m—2/ > 

= cri). 

rnll 
g ^ 

w 

II 

-vd 0 ) 
1 l*j 

ni 



=- E 

X,, .. 

. ,_X> _1 

* t-TTi -L 


l<^ 1 <...<£ m _ 1 <n 1 




£ j- ^ a, j =1,..., m—1 




1 ij ’ 


(E.ll) 


771 — 1 


k =1 


L 4 


(E.12) 


Furthermore, we set 


Vi lI-:=E(^( X “)) 2 > A W : =E( T S) 2 and D 


a=l 


a=l 


ni — 1 
m — 1 


(E.13)
By the definition of $\tilde q_{1\alpha,ij}$ in (D.2), we have $\tilde q_{1\alpha,ij} = (A\,h_{ij}(X_\alpha) + B\,S_{1ij} + T^{(\alpha)}_{1ij})/D$. By setting $L_1$ as


Li : = 


, ni ni 

-(52 E TM*.) + BS W + T<“>) 2 - E (M*«» 2 ) 


o:=l 




(E. 14 ) 


bounding Ii is equivalent to bounding P(L 4 > Ct). By expanding X)aLi( f H«,*j) 2 ’ we get 


ni ni n\ 

X><Mi) 2 = ( A X’ + + 2A5 )(^ib) 2 + 2A E + 2S '5ib- E T S) /H 2 - 


a=l 


Oi=l 


a=l 


To bound L 4 , we introduce 

Ji ■= U 2 - O 2 )^/(0 2 m)|, J 2 := \Ay(D 2 m)\, 

h := |(m B 2 + 2AB)(S lij ) 2 /(D 2 m)|, J 4 := 

We use the Cauchy-Schwarz inequality on $\sum_{\alpha=1}^{n_1} T^{(\alpha)}_{1ij}$ to get

n\ 

2 BS Uj E T S lj/ D2 < 2B\S lij \^A lij /(D 2 n 1 ). 

a=l 


This motivates us to set $J_5 := 2B|S_{1ij}|\sqrt{n_1\Lambda_{1ij}}/(D^2 n_1)$. By using the triangle inequality on $L_1$,
we obtain $L_1 \le J_1 + J_2 + J_3 + J_4 + J_5$. For $J_4$ and $J_5$, we use the Cauchy-Schwarz inequality on
$S_{1ij} = \sum_{\beta=1}^{n_1} h_{ij}(X_\beta)$ to yield


L 4 + J5 < 


This motivates us to define Jg := | 2 (T + Bni)V\ijkuj/n\D 2 \. Therefore, to bound L 4 , we only 
need to bound J 4 , J 2 , J3 and Jg separately. 

By the definitions of A and D , we obtain 




^BtI\ V\ij A-lij 

< 

2 T + 2 Bn\ , 

niD 2 V U 3 Au 3 

1 

n 4 O 2 


nijD 2 


A = 0 (raf 1 ), £> = OK 1 " 1 ) 


m—1 \ 


and 


D — A = 


n\ — 2 
m — 2 


= ok -2 )- 


Thus, for Ji, by the definition of Vi,*? in (E. 13 ), we use the Hoeffding inequality (Lemma F.l) to 
have 

Ji = P^ Of) < Ci exp(-C 2 nff 2 + C 3 n 2 f). (E. 15 ) 

Similarly, for J3, we use the Hoeffding inequality (Lemma F.l) to have 

(Suj ) 2 


h = P 


nr 


> Ct) < Oiexp(- 0 2 nH). 


(E. 16 ) 


To bound J 2 , by A 2 , ;j := X^LKiij) 2 ; we have 


A 2 


,2m-1 — 
1 


n 


>00 = p 


tl 1 ni 

(E( T S ) 2 ^ CnfK) < E P (( T S ^) 2 > CnK 2 t). 


a=l 



By the definition of $T^{(\alpha)}_{1ij}$ in (E.12), given $X_\alpha$, we can treat

m —1 

-^Qi . . . , Xg m _i ) — (hij(X a ) + ^ ^ hij(Xt k )'), 

fc=l 


as a symmetric kernel function. Therefore, given AC a , /I? is a U-statistic with a kernel function 
of zero mean and m—1 order. Hence, by Lemma F.2, it follows that we have the following inequality: 

P((Yg) 2 > Cn\ m -' 2 t\X a ) = P(|y£]| > Cn^VilXa) < Ci expt-CW). (E.17) 

By taking expectation of (E.17), we have 

P((^nj) 2 ^ Cn\t) < Ci exp(-C 2 nit). 


Hence, for J 2 , we construct the following bound: 


A 2 - ■ 

<h < 2 m— i ^ < C'iniexp(-C' 2 nif). 


n. 


(E.18) 


At last, by A = 0(n™ x ), B = 0(n™ 2 ) and D = 0(n™ 1 ), bounding Jq is equivalent to 
bounding ¥ (Vuj kuj / n™ > Ct). We then have 


J 6 = P 


V 2 A 2 


2—2K,2m—2-\-2hi — 

n 1 n 1 


> ct 2 < 


V; 


lij 


^2—2 k, — 
1 


> ct 


n 


M 


A 2 
A 1 ij 


> Ct 


n 


2m— 2+2k — ) ’ 

1 


where k > 0 will be given afterwards. Similarly to (E.15), we obtain 


V? 


^t>Ct) < Cl exp (-C^-^t 2 + C 3 n(-' 2K t). 


n 


2-2 k 
l 


3—4kj.2 


2 — 2 k . 


For the other term in (E.19), we have 


OL= 1 


v 1 1 

Similarly to (E.17), we have 

P(|Y^| > C\Jn 2m ~ 3+2K t^j < C 4 exp(-C' 5 nft). 
Therefore, Combing (E.19), (E.20), (E.21) and (E.22), we obtain 

J 6 < Ci exp (-C 2 nl~ iK t 2 + C 3 n\~ 2K t) + C 4 m exp (~C 5 n\ K t). 


(E.19) 


(E.20) 


A 2 n 1 n 1 - 

( 2 J?+2 k > «) < E P (( T S ) 2 ^ Cnl m ~ 3+2K t) = E F (l T Sl > C \J n 2rn ~ 3+2K t) . (E.21) 


(E.22) 


(E.23) 


Noticing $\log q = O(n^{1/3-\epsilon})$ (see Assumption (A2)), if we set $\kappa = 1/3$, combining (E.15), (E.16),
(E.18) and (E.23), we have (E.10) by setting $\varepsilon_n = 1/(\log q)^{\kappa_0}$ with $\kappa_0 > 0$ sufficiently small. This
completes the proof. □
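Bounds such as (E.4) and (E.5) are instances of Hoeffding's inequality (Lemma F.1) for averages of bounded variables, with the constants absorbed into $C_3, \ldots, C_6$. As a sanity check on the shape of such a bound, the sketch below compares the exact two-sided tail of the mean of fair coin flips with the classical Hoeffding bound $2\exp(-2nt^2)$ for variables taking values in $[0, 1]$. The coin-flip setting and the function names are assumptions made for the example; the paper's variables are the bounded kernel terms $h_{ij}(X_\alpha)$.

```python
import math


def exact_two_sided_tail(n, t):
    """P(|mean - 1/2| >= t) for the mean of n fair Bernoulli variables."""
    count = 0
    for k in range(n + 1):
        if abs(k / n - 0.5) >= t:
            count += math.comb(n, k)
    return count / 2 ** n


def hoeffding_bound(n, t):
    """Hoeffding bound 2 * exp(-2 * n * t^2) for variables with values in [0, 1]."""
    return 2.0 * math.exp(-2.0 * n * t * t)


if __name__ == "__main__":
    for n in (20, 100):
        for t in (0.1, 0.2, 0.3):
            print(n, t, exact_two_sided_tail(n, t), hoeffding_bound(n, t))
```

The exponent is linear in $n$ and quadratic in $t$, the same form as $C_3\exp(-C_4 n_1 t^2)$ in (E.4); this is what allows a factor $q^2$ to be absorbed once $\log q = O(n^{1/3-\epsilon})$.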



E.2 Proof of Lemma D.2 


Proof. By the definition of Zg jj in (C.12), we obtain 


n i 

( 9ij{X a )/ n l 


a=l 


ri2 

E gij{Y a )/n 2 - Tl'ij + r 2 ,ij ) 


a=l 


2 


(l,ij/n>l + (2,ij/n2 


711+712 ^ r , ni+n2 ^ 

( E E (ZuP 

- X 2 . 8 7 -—, (E.24) 

n Vt, n2 /;+ \ 2 n iCi,ij/m + n 2 (2,ij 

/L \^P,ij) 

0=1 


where Ci,*i and ( 2 ,ij are variances of gij(X a ) and gij(X a ). By the Bernstein’s inequality, for any 
M >0, we have 


max 

(ij)es 


max 

(*d)eS 


1 1 Z 

— E (■ z P,ijY —I Ci,' 

rai “ nf 

711+712 

- E (%«) 2 -& 


> C 


logg 


> C 


3=711 + 1 


logg 


= 0(q ~ M ), 
= 0 (g- M ), 


(E.25) 


by letting converge to zero sufficiently slowly. Considering > r a > 0 (see Assumption (A2)), 
by (E.25), we have that the two events 


ET= 1 (3w) : 




- 1 


< c 


logg 


and 


E 711+712 V 

/3=ni+l {^Pm) 


n 2 C 2 . 


- 1 


*3 


< C 


log g J ’ 


happen with probability going to one, as n, g -> 00 . Under these two events, we have 




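$$
\Big| \frac{\sum_{\beta=1}^{n_1+n_2} (Z_{\beta,ij})^2}{n_2^2\zeta_{1,ij}/n_1 + n_2\zeta_{2,ij}} - 1 \Big|
\le \Big| \frac{\sum_{\beta=1}^{n_1} (Z_{\beta,ij})^2}{n_2^2\zeta_{1,ij}/n_1} - 1 \Big|
+ \Big| \frac{\sum_{\beta=n_1+1}^{n_1+n_2} (Z_{\beta,ij})^2}{n_2\zeta_{2,ij}} - 1 \Big|
\le C\frac{\varepsilon_n}{\log q}.
\tag{E.26}
$$

By the self-normalized large deviation for independent variables in Jing et al. (2003), we have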


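$$
\mathbb{P}\Big( \Big(\sum_{\beta=1}^{n_1+n_2} Z_{\beta,ij}\Big)^2 \Big/ \sum_{\beta=1}^{n_1+n_2} (Z_{\beta,ij})^2 \ge t^2 \Big) \le C\big(1 - \Phi(t)\big),
\tag{E.27}
$$

uniformly for $t \in [0, O(n^{1/6-\epsilon})]$ for an arbitrary positive number $\epsilon$. Combining (E.24), (E.26) and
(E.27), as $n, q \to \infty$, we have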


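$$
\begin{aligned}
&\mathbb{P}\Big( \max_{(i,j)\in A} \frac{\Big(\frac{1}{n_1}\sum_{\alpha=1}^{n_1} g_{ij}(X_\alpha) - \frac{1}{n_2}\sum_{\alpha=1}^{n_2} g_{ij}(Y_\alpha) - \tau_{1,ij} + \tau_{2,ij}\Big)^2}{\zeta_{1,ij}/n_1 + \zeta_{2,ij}/n_2} \ge t^2 \Big) \\
&\quad \le |A| \max_{(i,j)\in A} \mathbb{P}\Big( \Big(\sum_{\beta=1}^{n_1+n_2} Z_{\beta,ij}\Big)^2 \Big/ \sum_{\beta=1}^{n_1+n_2} (Z_{\beta,ij})^2 \ge \Big(1 + C\frac{\varepsilon_n}{\log q}\Big)^{-1} t^2 \Big)
\le C|A|\big(1 - \Phi(t)\big).
\end{aligned}
$$

This completes the proof of Lemma D.2. □


E.3 Proof of Lemma D.3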


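Proof. Lemma D.3 is the same as Lemma D.2 except that in Lemma D.3 we use $\sigma^2_{a,ps}/4$ to replace
$\zeta_{a,ij}$ used in Lemma D.2. This motivates us to write

$$
\begin{aligned}
&\frac{\Big(\sum_{\alpha=1}^{n_1} g^{\tau}_{ij}(X_\alpha)/n_1 - \sum_{\alpha=1}^{n_2} g^{\tau}_{ij}(Y_\alpha)/n_2 - \tau_{1,ij} + \tau_{2,ij}\Big)^2}{\sigma^2_{1,ps}/(4n_1) + \sigma^2_{2,ps}/(4n_2)} \\
&\quad = \frac{\Big(\sum_{\alpha=1}^{n_1} g^{\tau}_{ij}(X_\alpha)/n_1 - \sum_{\alpha=1}^{n_2} g^{\tau}_{ij}(Y_\alpha)/n_2 - \tau_{1,ij} + \tau_{2,ij}\Big)^2}{\zeta_{1,ij}/n_1 + \zeta_{2,ij}/n_2}
\times \frac{\zeta_{1,ij}/n_1 + \zeta_{2,ij}/n_2}{\sigma^2_{1,ps}/(4n_1) + \sigma^2_{2,ps}/(4n_2)},
\end{aligned}
$$

where $\sigma^2_{a,ps}$ is defined in (C.37) and $\zeta_{1,ij}$ and $\zeta_{2,ij}$ are the variances of $g^{\tau}_{ij}(X_\alpha)$ and $g^{\tau}_{ij}(Y_\alpha)$. By the
definitions of $g^{\tau}_{ij}$ and $\zeta_{a,ij}$, under Assumption (A2), we have $0 < r_a \le \zeta_{a,ij} \le 1$. We then have

$$
0 < (C_1)^2 \le \frac{\sigma^2_{1,ps}/(4n_1) + \sigma^2_{2,ps}/(4n_2)}{1/n_1 + 1/n_2}
\le \frac{\sigma^2_{1,ps}/(4n_1) + \sigma^2_{2,ps}/(4n_2)}{\zeta_{1,ij}/n_1 + \zeta_{2,ij}/n_2}.
$$

Therefore, by Lemma D.2, we have

$$
\begin{aligned}
&\mathbb{P}\Big( \max_{(i,j)\in A} \frac{\Big(\frac{1}{n_1}\sum_{\alpha=1}^{n_1} g^{\tau}_{ij}(X_\alpha) - \frac{1}{n_2}\sum_{\alpha=1}^{n_2} g^{\tau}_{ij}(Y_\alpha) - \tau_{1,ij} + \tau_{2,ij}\Big)^2}{\sigma^2_{1,ps}/(4n_1) + \sigma^2_{2,ps}/(4n_2)} \ge t^2 \Big) \\
&\quad \le \mathbb{P}\Big( \max_{(i,j)\in A} \frac{\Big(\frac{1}{n_1}\sum_{\alpha=1}^{n_1} g^{\tau}_{ij}(X_\alpha) - \frac{1}{n_2}\sum_{\alpha=1}^{n_2} g^{\tau}_{ij}(Y_\alpha) - \tau_{1,ij} + \tau_{2,ij}\Big)^2}{\zeta_{1,ij}/n_1 + \zeta_{2,ij}/n_2} \ge (C_1)^2 t^2 \Big)
\le C|A|\big(1 - \Phi(C_1 t)\big).
\end{aligned}
\tag{E.28}
$$

This completes the proof of Lemma D.3. □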


F Some Useful Technical Lemmas 

In this appendix, we introduce the following technical lemmas that we use in our proofs. 

Lemma F.1. (Hoeffding's inequality (Hoeffding, 1963)). Let $X_1, \ldots, X_n$ be independent random
variables, where $X_i$ takes its values in $[a_i, b_i]$. Letting $S = \sum_{i=1}^{n} (X_i - \mathbb{E}[X_i])$, we have

$$
\mathbb{P}(S \ge t) \le \exp\Big( -\frac{2t^2}{\sum_{i=1}^{n} (b_i - a_i)^2} \Big),
$$

for any positive $t$.
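As a quick sanity check of Lemma F.1, the following minimal sketch compares a Monte Carlo estimate of $\mathbb{P}(S \ge t)$ with the bound $\exp(-2t^2/\sum_i (b_i - a_i)^2)$. The Uniform[0, 1] summands, the sample sizes, and the function names `hoeffding_bound` and `empirical_tail` are illustrative assumptions and are not part of the paper.

```python
import math
import random


def hoeffding_bound(t, ranges):
    """Upper bound exp(-2 t^2 / sum_i (b_i - a_i)^2) on P(S >= t)."""
    denom = sum((b - a) ** 2 for a, b in ranges)
    return math.exp(-2.0 * t * t / denom)


def empirical_tail(t, n, reps, seed=0):
    """Monte Carlo estimate of P(S >= t) for S = sum_i (X_i - E[X_i]),
    with X_i ~ Uniform[0, 1] (an illustrative choice, so E[X_i] = 0.5)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        s = sum(rng.random() - 0.5 for _ in range(n))
        if s >= t:
            hits += 1
    return hits / reps


if __name__ == "__main__":
    n = 50
    ranges = [(0.0, 1.0)] * n
    for t in (2.0, 4.0, 6.0):
        # Print t, the empirical tail, and the Hoeffding bound side by side.
        print(t, empirical_tail(t, n, reps=5000), hoeffding_bound(t, ranges))
```

The bound is distribution-free, so the empirical tail is expected to sit well below it for any particular bounded distribution.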

To present some useful results on U-statistics, we set $X_1, \ldots, X_n$ to be independent and identically
distributed random vectors in $\mathbb{R}^d$. $\Phi(x_1, \ldots, x_m)$ is a kernel function of $m$ ($< n$) vectors
$x_\ell = (x_{\ell 1}, \ldots, x_{\ell d})^T$. The U-statistic is defined as

$$
U := \sum \Phi(X_{i_1}, \ldots, X_{i_m}) \big/ \big[ n(n-1)\cdots(n-m+1) \big],
$$

where the summation is taken over all $m$-tuples $i_1, \ldots, i_m$ of distinct positive integers not exceeding $n$.

Lemma F.2. (Exponential inequality for bounded U-statistics (Hoeffding, 1963)). If the kernel
function $\Phi$ is bounded, i.e., $a \le \Phi(x_1, \ldots, x_m) \le b$, we have

$$
\mathbb{P}(U - \mathbb{E}[U] \ge t) \le \exp\big( -2kt^2/(b-a)^2 \big),
$$

where $k = \lfloor n/m \rfloor$.
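To make the definition concrete, here is a minimal sketch that evaluates a U-statistic by brute force over all $m$-tuples of distinct indices, using the Kendall's tau kernel $\mathrm{sign}((x_1 - x_2)(y_1 - y_2))$ as the bounded kernel, together with the tail bound of Lemma F.2. The function names and the toy data are hypothetical, and the brute-force enumeration is only meant for small $n$.

```python
import itertools
import math


def u_statistic(sample, kernel, m):
    """Average of the kernel over all m-tuples of distinct indices."""
    n = len(sample)
    total = 0.0
    count = 0
    for idx in itertools.permutations(range(n), m):
        total += kernel(*(sample[i] for i in idx))
        count += 1
    return total / count  # count equals n (n - 1) ... (n - m + 1)


def kendall_kernel(p, q):
    """Kendall's tau kernel sign((x1 - x2)(y1 - y2)); values lie in [-1, 1]."""
    prod = (p[0] - q[0]) * (p[1] - q[1])
    return (prod > 0) - (prod < 0)


def u_tail_bound(t, n, m, a, b):
    """Bound exp(-2 k t^2 / (b - a)^2) with k = floor(n / m)."""
    k = n // m
    return math.exp(-2.0 * k * t * t / (b - a) ** 2)


if __name__ == "__main__":
    data = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0), (5.0, 5.0)]
    print(u_statistic(data, kendall_kernel, m=2))
    print(u_tail_bound(0.5, n=len(data), m=2, a=-1.0, b=1.0))
```

For the Kendall's tau kernel, $a = -1$ and $b = 1$, so the exponent in Lemma F.2 reduces to $-kt^2/2$.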

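We define $U = (U_1, \ldots, U_K)^T \in \mathbb{R}^K$ as a $K$-dimensional U-statistic with the kernel function
$\Phi(\cdot) = (\Phi_1(\cdot), \ldots, \Phi_K(\cdot))^T \in \mathbb{R}^K$, where $\Phi_k(\cdot)$ is defined as

$$
\Phi_k(\cdot) : \underbrace{\mathbb{R}^d \times \cdots \times \mathbb{R}^d}_{m(k)} \to \mathbb{R} \quad \text{for } 1 \le k \le K.
$$

Letting $\phi_k := \mathbb{E}[\Phi_k(X_1, \ldots, X_{m(k)})]$, we define

$$
\zeta_{k\ell} := \mathbb{E}\Big\{ \mathbb{E}\big[\Phi_k(X_1, \ldots, X_{m(k)}) - \phi_k \mid X_1\big] \, \mathbb{E}\big[\Phi_\ell(X_1, \ldots, X_{m(\ell)}) - \phi_\ell \mid X_1\big] \Big\}.
$$

Hence, by setting $\Sigma_u = (m(k)m(\ell)\zeta_{k\ell}) \in \mathbb{R}^{K \times K}$, we introduce the central limit theorem for
U-statistics.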


Lemma F.3. (Central limit theorem for U-statistics (Hoeffding, 1948)). We assume $\Phi_k(\cdot)$ is a
symmetric kernel for $1 \le k \le K$. If $\mathbb{E}[\Phi_k^2(X_1, \ldots, X_{m(k)})]$ exists for $k = 1, \ldots, K$, as $n \to \infty$, we
have

$$
\big( \sqrt{n}(U_1 - \phi_1), \ldots, \sqrt{n}(U_K - \phi_K) \big)^T \to N(0, \Sigma_u),
$$

where $N(0, \Sigma_u)$ is the distribution of a normal random vector with mean vector $0$ and $K \times K$
covariance matrix $\Sigma_u$.
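A small simulation can illustrate the limiting variance $m^2\zeta$ in the scalar case $K = 1$. The sketch below uses Kendall's tau ($m = 2$) with independent continuous coordinates, which is an assumption made only for this example: in that case $\phi = 0$, the projection is $g(x, y) = (2F(x) - 1)(2G(y) - 1)$ with $F$ and $G$ the marginal distribution functions, so $\zeta = 1/9$ and the limiting variance is $4/9$. The function names and simulation sizes are hypothetical.

```python
import random


def kendall_tau(xs, ys):
    """Kendall's tau as a U-statistic of order m = 2."""
    n = len(xs)
    total = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (xs[i] - xs[j]) * (ys[i] - ys[j])
            total += (prod > 0) - (prod < 0)
    return 2.0 * total / (n * (n - 1))


def scaled_variance(n, reps, seed=0):
    """Monte Carlo estimate of Var(sqrt(n) * (U - phi)) with phi = 0,
    for independent standard normal coordinates (illustrative choice)."""
    rng = random.Random(seed)
    values = []
    for _ in range(reps):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        ys = [rng.gauss(0.0, 1.0) for _ in range(n)]
        values.append(n ** 0.5 * kendall_tau(xs, ys))
    mean = sum(values) / reps
    return sum((v - mean) ** 2 for v in values) / (reps - 1)


if __name__ == "__main__":
    # Compare the simulated variance with m^2 * zeta = 4 * (1/9).
    print(scaled_variance(n=40, reps=500), 4.0 / 9.0)
```

For moderate $n$ the simulated value is expected to be slightly above $4/9$, since the limit is only reached as $n \to \infty$.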


To get the central limit theorem, we need the Hoeffding decomposition. The Hoeffding decomposition
divides a U-statistic into two pieces. One is a sum of i.i.d. random variables. The other is
a small residual term. For notational simplicity, for $\ell = 1, \ldots, n$, we set

$$
g(X_\ell) = \mathbb{E}[\Phi(X_{\ell_1}, \ldots, X_{\ell_m}) \mid X_\ell] \quad \text{and} \quad h(X_\ell) = g(X_\ell) - \phi,
$$

where $\phi = \mathbb{E}[g(X_\ell)]$ and $\ell \in \{\ell_1, \ldots, \ell_m\}$.

Lemma F.4. (Hoeffding decomposition (Hoeffding, 1948)). If $U$ is a U-statistic with a symmetric
kernel function $\Phi(x_1, x_2, \ldots, x_m)$, we can decompose $U$ as

$$
U = \frac{m}{n} \sum_{\ell=1}^{n} h(X_\ell) + \binom{n}{m}^{-1} \Delta_n,
$$

where we set

$$
\Delta_n = \sum_{1 \le \ell_1 < \ell_2 < \cdots < \ell_m \le n} \Big( \Phi(X_{\ell_1}, \ldots, X_{\ell_m}) - \sum_{k=1}^{m} h(X_{\ell_k}) \Big).
$$

In Lemma F.4, $m \sum_{\ell=1}^{n} h(X_\ell)/n$ is a sum of i.i.d. random variables and $\binom{n}{m}^{-1} \Delta_n$ is a small
residual term. Therefore, we can use $m \sum_{\ell=1}^{n} h(X_\ell)/n$ to approximate $U$. Combining Lemmas F.3
and F.4, we know that $U$ and $m \sum_{\ell=1}^{n} h(X_\ell)/n$ have the same limiting distribution as $n \to \infty$.
Next, we introduce a lemma on the tail probability of a normal distribution.
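The decomposition can be checked numerically on a toy example. The sketch below assumes the kernel $\Phi(x_1, x_2) = x_1 x_2 + x_1 + x_2$ with mean-zero data, for which $g(x) = x$, $\phi = 0$ and $h(x) = x$, and it takes the residual $\Delta_n$ to be the sum over index pairs of the kernel minus the corresponding $h$ terms. The kernel, the sample size, and the function names are hypothetical choices for illustration only.

```python
import itertools
import math
import random


def kernel(x1, x2):
    """Symmetric kernel of order m = 2 (illustrative choice)."""
    return x1 * x2 + x1 + x2


def h(x):
    """h(x) = g(x) - phi; for mean-zero X, g(x) = E[kernel(x, X)] = x and phi = 0."""
    return x


def decompose(sample):
    """Return (U, linear term, residual term) of the Hoeffding decomposition."""
    n, m = len(sample), 2
    pairs = list(itertools.combinations(sample, m))
    u = sum(kernel(a, b) for a, b in pairs) / len(pairs)
    linear = m * sum(h(x) for x in sample) / n
    delta = sum(kernel(a, b) - h(a) - h(b) for a, b in pairs)
    residual = delta / math.comb(n, m)
    return u, linear, residual


if __name__ == "__main__":
    rng = random.Random(0)
    sample = [rng.gauss(0.0, 1.0) for _ in range(200)]
    u, linear, residual = decompose(sample)
    print(u, linear + residual)  # the two numbers agree up to rounding
    print(abs(linear), abs(residual))  # sizes of the linear and residual terms
```

In this example the residual is the average of the products $X_i X_j$ over pairs, which is of smaller order than the linear term when the data have mean zero.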


Lemma F.5. If $\xi$ follows a standard normal distribution, we have

$$
\frac{t}{t^2 + 1} \cdot \frac{1}{\sqrt{2\pi}} e^{-t^2/2} \le \mathbb{P}(\xi > t) \le \frac{1}{t} \cdot \frac{1}{\sqrt{2\pi}} e^{-t^2/2},
$$

for any $t > 0$.
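The two-sided bound in Lemma F.5 is easy to check numerically. The following minimal sketch evaluates the exact standard normal tail through the complementary error function and compares it with the lower bound $t/(t^2+1)$ times the normal density and the upper bound $1/t$ times the normal density. The function names are hypothetical.

```python
import math


def normal_tail(t):
    """P(xi > t) for a standard normal xi, via the complementary error function."""
    return 0.5 * math.erfc(t / math.sqrt(2.0))


def tail_bounds(t):
    """Lower and upper bounds of Lemma F.5 for t > 0."""
    density = math.exp(-t * t / 2.0) / math.sqrt(2.0 * math.pi)
    return t / (t * t + 1.0) * density, density / t


if __name__ == "__main__":
    for t in (0.5, 1.0, 2.0, 4.0):
        lower, upper = tail_bounds(t)
        # Print t, the lower bound, the exact tail, and the upper bound.
        print(t, lower, normal_tail(t), upper)
```

The ratio of the two bounds is $t^2/(t^2+1)$, so they become tight as $t$ grows, which is the regime relevant for extreme value arguments.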
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